Minimum Communication Cost for Joint Distributed 
Source Coding and Dispersive Information Routing 

Kumar Viswanatha, Student Member, IEEE, Emrah Akyol, Student Member, IEEE 

and Kenneth Rose, Fellow, IEEE 



Abstract — This paper considers the problem of minimum cost 
communication of correlated sources over a network with multi- 
ple sinks, which consists of distributed source coding followed by 
routing. We introduce a new routing paradigm called dispersive 
information routing, wherein the intermediate nodes are allowed 
to 'split' a packet and forward subsets of the received bits on 
each of the forward paths. This paradigm opens up a rich class of 
research problems which focus on the interplay between encoding 
and routing in a network. Unlike conventional routing methods 
such as in [1], dispersive information routing ensures that each 
sink receives just the information needed to reconstruct the 
sources it is required to reproduce. We demonstrate using simple 
examples that our approach offers better asymptotic performance 
than conventional routing techniques. This paradigm leads to 
a new information theoretic setup, which has not been studied 
earlier. We propose a new coding scheme, using principles from 
multiple descriptions encoding [2] and Han and Kobayashi 
decoding [3]. We show that this coding scheme achieves the 
complete rate region for certain special cases of the general setup 
and thereby achieves the minimum communication cost under 
this routing paradigm. 

Index Terms — Distributed source coding. Minimum cost rout- 
ing. Compression of correlated sources 

I. Introduction 

Compression of sources in conjunction with communication 
over a network has been an important research area, notably 
with the recent advancements in distributed compression of 
correlated sources and network (routing) design, coupled 
with the deployment of various sensor networks. Encoding 
correlated sources in a network, such as a sensor network 
with multiple nodes and sinks as shown in Fig. 1, has 
conventionally been approached from two different directions. 
The first approach is routing the information from different 
sources in such a way as to efficiently re-compress the data 
at intermediate nodes without recourse to distributed source 
coding (DSC) methods (we refer to this approach as joint 
coding via 'explicit communication'). Such techniques tend 
to be wasteful at all but the last hops of the communication 
path. The second approach performs DSC followed by simple 
routing. Well designed DSC followed by optimal routing can 
provide good performance gains. We will focus on the latter 
category. Relevant background on DSC and route selection in 
a network is given in the next section. 

Fig. 1: A general multi-source multi-sink sensor network. The 
circles denote sources and stars denote sinks. The arrows 
denote allowed communication links. 

This paper focuses on minimum cost communication of 
correlated sources over a network with multiple sinks. We 
introduce a new routing paradigm called Dispersive Informa- 
tion Routing (DIR), wherein intermediate nodes are allowed 
to "split a packet" and forward a subset of the received bits 
on each of the forward paths. This paradigm opens up a 
rich class of research problems which focus on the interplay 
between encoding and routing in a network. What makes it 
particularly interesting is the challenge in encoding sources 
such that exactly the required information is routed to each 
sink, to reconstruct the prescribed subset of sources. We will 
show, using simple examples, that asymptotically DIR achieves 
a lower cost than conventional routing methods, wherein the 
sinks usually receive more information than they need. This 
paradigm leads to a general class of information theoretic 
problems, which have not been studied earlier. In this paper, 
we formulate this problem and the associated rate region. We 
introduce a new (random) coding technique using principles 
from multiple descriptions encoding and Han and Kobayashi 
decoding, which leads to an achievable rate region for this 
problem. We show that this achievable rate region is complete 
under certain special scenarios. 

The rest of the paper is organized as follows. In Section 
II, we review prior work related to distributed source coding 
and network routing. Before stating the problem formally, in 
Section III, we provide two simple examples to demonstrate the 
basic principles behind DIR and the new encoding scheme. 
We also demonstrate the suboptimality of conventional routing 
methods using these simple examples. In Section IV, we 
formally state the DIR problem and provide an achievable rate 
region. Finally, in Section V, we show that this achievable rate 
region is complete for some special cases of the setup. 

II. Prior Work 

Multi-terminal source coding has one of its early roots in 
the seminal work of Slepian and Wolf [4]. They showed, in 
the context of lossless coding, that side-information available 
only at the decoder can nevertheless be fully exploited as if 
it were available to the encoder, in the sense that there is 
no asymptotic performance loss. Later, Wyner and Ziv [5] 
derived a lossy coding extension that bounds the rate-distortion 
performance in the presence of decoder side information. 
Extensive work followed considering different network scenarios 
and obtaining achievable rate regions for them, including [6]-[14]. 
Han and Kobayashi [3] extended the Slepian-Wolf result 
to general multi-terminal source coding scenarios. For a 
multi-sink network, with each sink reconstructing a prespecified 
subset of the sources, they characterized an achievable rate 
region for lossless reconstruction of the required sources at 
each sink. Csiszar and Korner [15] provided an alternative 
characterization of the achievable rate region for the same 
setup by relating the region to the solution of a class of 
problems called the "entropy characterization problems". 

There has also been a considerable amount of work on 
joint compression-routing for networks. A survey of routing 
techniques for sensor networks is given in [16]. It was shown 
in [17] that the problem of finding the optimum route for 
compression using explicit communication is an NP-complete 
problem. The authors of [18] compared different joint compression-routing 
schemes for a correlated sensor grid and also proposed an 
approximate, practical, static source clustering scheme to 
achieve compression efficiency. Much of the above work is 
related to compression using explicit communication, without 
recourse to distributed source coding techniques. Cristescu et 
al. [1] considered joint optimization of Slepian-Wolf coding 
and a routing mechanism, which we call 'broadcasting'^1, wherein 
each source broadcasts its information to all sinks that intend 
to reconstruct it. Such a routing mechanism is motivated by 
the extensive literature on optimal routing for independent 
sources [19]. The authors of [20] proved the general optimality of that 
approach for networks with a single sink. We recently demonstrated its 
sub-optimality for the multi-sink scenario in [21]. 
This paper takes a step further towards finding the best joint 
compression-routing mechanism for a multi-sink network. We 
note that a preliminary version of our results appeared in [22] 
and [23]. 

We note the existence of a volume of work on minimum 
cost network coding for correlated sources, e.g., [24], [25]. 
But the routing mechanism we introduce in this paper does 
not require possibly complex network coders at intermediate 
nodes, and can be realized using simple conventional routers. 
The approach does have potential implications on network 
coding, but these are beyond the scope of this paper. 

^1 Note that we loosely use the term 'broadcasting' instead of 'multicasting' 
to stress the fact that all the information transmitted by any source is routed 
to every sink that reconstructs the source. Also, our approach to routing is, in 
some aspects, a variant of multicasting. 

Fig. 2: Example 1 - Conventional Routing 

III. Dispersive Information Routing - Simple 
Networks 

A. Basic Notation 

We begin by introducing the basic notation. In what follows, 
$2^{S}$ denotes the set of all subsets (power set) of any set $S$ 
and $|S|$ denotes the set cardinality. Note that $|2^{S}| = 2^{|S|}$. 
$S^{c}$ denotes the set complement (the universal set will be 
specified when there is ambiguity) and $\phi$ denotes the null 
set. For two sets $S_1$ and $S_2$, we denote the set difference 
by $S_1 - S_2 = \{s : s \in S_1, s \notin S_2\}$. Random variables 
are denoted by upper case letters (for example $X$) and their 
realizations are denoted by lower case letters (for example $x$). 
We also use upper case letters to denote source nodes and 
sinks and the ambiguity will be clarified wherever necessary. 
A sequence of $n$ independent and identically distributed (iid) 
random variables and its realization are denoted by $X^n$ and 
$x^n$, respectively. The length $n$, $\epsilon$-typical set is denoted by 
$T_\epsilon^n$. $X \leftrightarrow Y \leftrightarrow Z$ denotes that the three random variables 
$(X, Y, Z)$ form a Markov chain in that order. Notation in [26] 
is used to denote standard information theoretic quantities. 

B. Illustrative example - No helpers case 

Consider the network shown in Fig. 2. There are three 
source nodes, $E_0$, $E_1$ and $E_2$, and two sinks, $S_1$ and $S_2$. The 
three source nodes observe correlated memoryless sequences 
$X_0^n$, $X_1^n$ and $X_2^n$, respectively. Sink $S_1$ reconstructs the pair 
$(X_0^n, X_1^n)$, while $S_2$ reconstructs $(X_0^n, X_2^n)$. $E_0$ communicates 
with the two sinks through an intermediate node (called 
the 'collector') which is functionally a simple router. The edge 
weights on each path in the network are as shown in the 
figure. The cost of communication through an edge $e$ is a 
function of the bit rate flowing through it, denoted by $R_e$, and 
the corresponding edge weight, denoted by $W_e$. In this 
paper, we assume for simplicity that the cost is the simple product 
$C(R_e, W_e) = R_e W_e$, noting that the approach extends directly 
to more complex cost functions. We further assume 
that the total cost is the sum of the individual communication 
costs over each edge. The objective is to find the minimum 
total communication cost for lossless transmission of sources 
to the respective sinks. 

We first consider the communication cost when broadcast 
routing is employed [1], wherein the routers forward all the 
bits received from a source to all the decoders that would 
reconstruct it. In other words, routers are not allowed to "split" 
a packet and forward a portion of the received information on 
the forward paths. Hence the branches connecting the collector 
to the two sinks carry the same rates as the branch connecting 
$E_0$ to the collector. We denote the rates at which $X_0$, $X_1$ and 
$X_2$ are encoded by $R_0$, $R_1$ and $R_2$, respectively. 

Using results in [1], it can be shown that the minimum 
communication cost under broadcast routing is given by the 
solution to the following linear programming formulation: 

$C_{br} = \min \{ (W_0 + W_1 + W_2) R_0 + W_{11} R_1 + W_{22} R_2 \}$    (1) 

under the constraints: 

$R_0 \ge \max( H(X_0|X_1), H(X_0|X_2) )$ 
$R_1 \ge H(X_1|X_0)$ 
$R_2 \ge H(X_2|X_0)$ 
$R_1 + R_0 \ge H(X_0, X_1)$ 
$R_2 + R_0 \ge H(X_0, X_2)$    (2) 

To gain intuition into dispersive information routing, we will 
later consider a special case of the above network when the 
branch weights are such that $W_{11}, W_{22} \ll W_0, W_1, W_2$. Let 
us specialize the above equations for this case. The constraint 
$W_{11}, W_{22} \ll W_0, W_1, W_2$ implies that $X_1$ and $X_2$ should 
be encoded at rates $R_1 = H(X_1)$ and $R_2 = H(X_2)$, 
respectively. Therefore the scenario effectively captures the 
case when $X_1$ and $X_2$ are available as side information at 
the respective decoders. It follows from (1) and (2) that, for 
achieving minimum communication cost, $R_0$ is: 

$R_0^* = \max\{ H(X_0|X_1), H(X_0|X_2) \}$    (3) 

and therefore the minimum communication cost is given by: 

$C_{br}^* = (W_0 + W_1 + W_2) R_0^* + W_{11} H(X_1) + W_{22} H(X_2)$    (4) 



Is this the best we can do? The collector has to transmit enough 
information to sink $S_1$ for it to decode $X_0$ and therefore the 
rate is at least $H(X_0|X_1)$. Similarly, the rate on the branch 
connecting the collector to $S_2$ is at least $H(X_0|X_2)$. But if 
$H(X_0|X_1) \ne H(X_0|X_2)$, there is excess rate on one of the 
branches. 
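The excess rate can be made concrete with a small numerical sketch. The joint distribution below is a hypothetical toy source chosen only for illustration (it does not appear in the paper), and the helper name `cond_entropy` is likewise an assumption. The code evaluates the two conditional entropies and the broadcast rate of (3), then reports how much surplus rate each collector-to-sink branch carries.

```python
import math
from collections import defaultdict

def cond_entropy(joint, target, given):
    """Return H(target | given) in bits; joint maps outcome tuples to probabilities."""
    p_tg = defaultdict(float)  # marginal pmf of (target, given)
    p_g = defaultdict(float)   # marginal pmf of given
    for outcome, p in joint.items():
        t = tuple(outcome[i] for i in target)
        g = tuple(outcome[i] for i in given)
        p_tg[(t, g)] += p
        p_g[g] += p
    return sum(p * math.log2(p_g[g] / p) for (t, g), p in p_tg.items() if p > 0)

# Hypothetical toy source (not from the paper): X0 = (A, B) with A, B
# independent fair bits, X1 = A, and X2 = (A, B). Outcomes are (x0, x1, x2).
joint = {((a, b), a, (a, b)): 0.25 for a in (0, 1) for b in (0, 1)}

h0_given_1 = cond_entropy(joint, target=[0], given=[1])  # H(X0|X1)
h0_given_2 = cond_entropy(joint, target=[0], given=[2])  # H(X0|X2)

# Equation (3): under broadcast routing every branch carries the larger rate.
r0_star = max(h0_given_1, h0_given_2)
excess_to_s1 = r0_star - h0_given_1  # surplus rate on the collector -> S1 branch
excess_to_s2 = r0_star - h0_given_2  # surplus rate on the collector -> S2 branch

print(h0_given_1, h0_given_2, r0_star, excess_to_s1, excess_to_s2)
```

For this toy source, sink $S_2$ already knows $X_0$ from its side information, so every bit forwarded to it under broadcast routing is surplus.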

Let us now relax this restriction and allow the collector node 
to "split" the packet and route different subsets of the received 
bits on the forward paths. We could equivalently think of the 
source $E_0$ transmitting 3 smaller packets to the collector; the 
first packet has a rate of $R_{0,\{1,2\}}$ bits and is destined to both 
sinks. Two other packets have rates $R_{0,1}$ and $R_{0,2}$ and are 
destined to sinks $S_1$ and $S_2$, respectively. Technically, in this 
case, the collector is again a simple conventional router. 

We refer to such a routing mechanism, where each 
intermediate node transmits a subset of the received bits on each 
of the forward paths, as "Dispersive Information Routing" 
(DIR). Note that unlike network coding, DIR does not 
require possibly expensive coders at intermediate nodes, and 
can always be realized using conventional routers, with each 
source transmitting multiple packets into the network intended 
for different subsets of sinks. Hereafter, we interchangeably 
use the ideas of "packet splitting" at intermediate nodes and 
conventional routing of smaller packets, noting the equivalence 
in achievable rates and costs. This scenario is depicted in Fig. 
3, with the modified cost each packet encounters. 

Fig. 3: Example - DIR. Note that the notion of 'packet 
splitting' is equivalent to the sources transmitting multiple 
smaller packets 

Two obvious questions arise - Does DIR achieve a lower 
communication cost compared to conventional routing? If so, 
what is the minimum communication cost under DIR? 

We first aim to find the minimum cost using DIR under the 
special case of $W_{11}, W_{22} \ll W_0, W_1, W_2$ (i.e., $R_1 = H(X_1)$ 
and $R_2 = H(X_2)$). To establish the minimum communication 
cost we need to first establish the complete achievable rate 
region for the rate tuple $\{R_{0,1}, R_{0,\{1,2\}}, R_{0,2}\}$ for lossless 
reconstruction of $X_0$ at both the decoders and then find the point 
in the achievable rate region that minimizes the total 
communication cost, determined using the modified weights shown in 
Fig. 3. Before deriving the ultimate solution, it is instructive to 
consider one operating point, $P_1 = \{R_{0,1}, R_{0,\{1,2\}}, R_{0,2}\} = 
\{I(X_2;X_0|X_1), H(X_0|X_1,X_2), I(X_1;X_0|X_2)\}$, and provide 
the coding scheme that achieves it. Extension to other 
"interesting points" and to the whole achievable region follows along 
similar lines. This particular rate point is considered first due 
to its intuitive appeal, as shown in a Venn diagram (Fig. 4). 
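The intuition behind this operating point rests on the chain rule for entropy. As a short worked step (standard identities, stated here only to make the Venn-diagram picture explicit), each sink's minimum rate splits into a part that neither side information explains and a part that only the other sink's side information explains:

```latex
\begin{align*}
H(X_0 \mid X_1) &= H(X_0 \mid X_1, X_2) + I(X_0 ; X_2 \mid X_1), \\
H(X_0 \mid X_2) &= H(X_0 \mid X_1, X_2) + I(X_0 ; X_1 \mid X_2).
\end{align*}
```

The term $H(X_0|X_1,X_2)$ is common to both sinks and is the natural candidate for the packet routed to both, while $I(X_0;X_2|X_1)$ is needed only by $S_1$ and $I(X_0;X_1|X_2)$ only by $S_2$.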

Gray and Wyner considered a closely resembling 
network [13], shown in Fig. 5. In their setup, the encoder 
observes iid sequences of 2 correlated random variables 
$(X_1, X_2)$ and transmits 3 packets (at rates $R_{0,1}$, $R_{0,\{1,2\}}$, $R_{0,2}$, 
respectively), one meant for each subset of sinks. The 
two sinks reconstruct sequences $X_1^n$ and $X_2^n$, respectively. 
They showed that the rate tuple $\{R_{0,1}, R_{0,\{1,2\}}, R_{0,2}\} = 
\{H(X_1|X_2), I(X_1;X_2), H(X_2|X_1)\}$ is not achievable in 
general and that there is a rate loss due to transmitting a 
common bit stream, in the sense that individual decoders 
must receive more information than they need to reconstruct 
their respective sources if the sum rate is maintained at its 
minimum. Wyner defined the term "Common Information" 
[11], here denoted by $C_W(X_1;X_2)$, as the minimum rate 
$R_{0,\{1,2\}}$ such that $\{R_{0,1}, R_{0,\{1,2\}}, R_{0,2}\}$ is achievable and 
$R_{0,1} + R_{0,\{1,2\}} + R_{0,2} = H(X_1, X_2)$. He also showed that 
$C_W(X_1;X_2) = \min I(X_1, X_2; U)$, where the min is taken over 
all auxiliary random variables $U$ such that $X_1 \leftrightarrow U \leftrightarrow X_2$ 
form a Markov chain. He further showed that, in general, 
$I(X_1;X_2) \le C_W(X_1;X_2) \le \max(H(X_1), H(X_2))$. We note 
in passing the existence of an earlier definition of common 
information by Gacs and Korner [27], which measures the 
maximum shared information that can be fully utilized by 
both the decoders. It is less relevant to dispersive information 
routing. 

Fig. 4: Venn Diagram based intuition: (a) Amount of 
information routed using DIR when operating at point $P_1$. Observe 
that each of the sinks receives information at the respective 
minimum rate. Green represents $R_{0,12}$, Blue represents $R_{0,1}$ 
and Red represents $R_{0,2}$. (b) Intuitive representation of Wyner's 
common information. Observe that in Wyner's setup, it is 
generally not possible to split the information exactly and that 
there is a rate loss due to transmitting the common bit stream. 

Fig. 5: Gray-Wyner Setup. Note the resemblance to the DIR 
setup in Fig. 3. 

At first glance, it might be tempting to extend Wyner's 
argument to the DIR setting and say $P_1$ is not achievable 
in general, i.e., each decoder has to receive more information 
than it needs. But interestingly enough, a rather simple coding 
scheme achieves this point and simple extensions of the coding 
scheme can achieve the entire rate region for this example. The 
primary difference between the Gray-Wyner network and DIR is 
that in their setup two correlated sources are encoded jointly 
for separate decoding at each sink. However, in our setup, $X_0$ 
is encoded for lossless decoding at both the sinks. Note that 
this section only provides intuitive arguments to support the 
result. A coding scheme will be formally derived in Section 
IV for the general setup. 

We concentrate on encoding at $E_0$ assuming that $E_1$ and 
$E_2$ transmit at their respective source entropies. $E_0$ observes 
a sequence of $n$ iid random variables $X_0^n$. This sequence 
belongs to the typical set, $T_\epsilon^n$, with high probability. Every 
typical sequence is assigned 3 indices, each independent of 
the others. The three indices are assigned using uniform pmfs 
over $[1 : 2^{nR_{0,1}}]$, $[1 : 2^{nR_{0,\{1,2\}}}]$ and $[1 : 2^{nR_{0,2}}]$, respectively. 
All the sequences with the same first index, $m_{0,1}$, form a 
bin $\mathcal{B}_{0,1}(m_{0,1})$. Similarly, bins $\mathcal{B}_{0,2}(m_{0,2})$ and $\mathcal{B}_{0,12}(m_{0,12})$ 
are formed for all indices $m_{0,2}$ and $m_{0,12}$, respectively. Upon 
observing a sequence $x_0^n \in T_\epsilon^n$ with indices $m_{0,1}$, $m_{0,2}$ and 
$m_{0,12}$, the encoder transmits index $m_{0,1}$ to decoder 1 alone, 
index $m_{0,2}$ to decoder 2 alone and index $m_{0,12}$ to both the 
decoders. 

The first decoder receives indices $m_{0,1}$ and $m_{0,12}$. It tries 
to find a typical sequence $\hat{x}_0^n \in \mathcal{B}_{0,1}(m_{0,1}) \cap \mathcal{B}_{0,12}(m_{0,12})$ 
which is jointly typical with the decoded information sequence 
$x_1^n$. As the indices are assigned independently of each other, 
every typical sequence has a uniform pmf of being assigned 
to the index pair $\{m_{0,1}, m_{0,12}\}$ over $[1 : 2^{n(R_{0,1} + R_{0,\{1,2\}})}]$. 
Therefore, having received indices $m_{0,1}$ and $m_{0,12}$, using 
arguments similar to Slepian-Wolf [4] and Cover [7], the 
probability of decoding error asymptotically approaches zero 
if: 

$R_{0,1} + R_{0,\{1,2\}} \ge H(X_0|X_1)$    (5) 

Similarly, the probability of decoding error approaches zero at the 
second decoder if: 

$R_{0,2} + R_{0,\{1,2\}} \ge H(X_0|X_2)$    (6) 



Clearly (5) and (6) imply that $P_1$ is achievable. Along similar lines 
to [4], [7], the above achievable region can also be shown to 
satisfy the converse and hence is the complete achievable rate 
region for this problem. We term such a binning approach 
'Power Binning', as an independent index is assigned to 
each (non-trivial) subset of the decoders - the power set. It is 
worthwhile to note that the same rate region can be obtained 
by applying results of Han and Kobayashi [3], assuming 3 
independent encoders at $E_0$, albeit with a more complicated 
coding scheme involving multiple auxiliary random variables 
(see also [28]). We also note that the mechanism of assigning 
multiple independent random bin indices has been used in 
several related prior works, such as [29], [30]. 
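The index assignment in power binning can be sketched in a few lines. This is a toy illustration under stated assumptions, not the scheme's formal construction: all binary sequences of length 6 stand in for the typical set, the bin counts are arbitrary stand-ins for $2^{nR_{0,1}}$, $2^{nR_{0,\{1,2\}}}$ and $2^{nR_{0,2}}$, and the function names are hypothetical. The final decoding step, which picks the candidate jointly typical with the sink's side information, is omitted.

```python
import itertools
import random

def power_binning(sequences, n_bins, seed=0):
    """Give each sequence one independent uniform index per non-empty subset of sinks."""
    rng = random.Random(seed)
    return {
        seq: {subset: rng.randrange(size) for subset, size in n_bins.items()}
        for seq in sequences
    }

def candidates(indices, received):
    """Sequences whose bin indices agree with every index a sink received."""
    return [
        seq for seq, idx in indices.items()
        if all(idx[subset] == m for subset, m in received.items())
    ]

# Hypothetical toy setup: keys name the subset of sinks a packet is routed to.
sequences = list(itertools.product((0, 1), repeat=6))
n_bins = {"1": 2, "12": 4, "2": 8}
indices = power_binning(sequences, n_bins)

x0 = sequences[37]   # the sequence observed at E0
sent = indices[x0]   # its three bin indices

# Each sink intersects the bins named by the two indices routed to it.
at_sink1 = candidates(indices, {"1": sent["1"], "12": sent["12"]})
at_sink2 = candidates(indices, {"2": sent["2"], "12": sent["12"]})
common_only = candidates(indices, {"12": sent["12"]})

print(len(common_only), len(at_sink1), len(at_sink2))
```

The true sequence always survives in both intersections, and each private index can only shrink the candidate list left by the common index; the rate conditions (5) and (6) are what make the surviving list small enough for the side information to resolve.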

The minimum cost operating point is the point that satisfies 
equations (5) and (6) and minimizes the cost function: 

$C_{\mathrm{DIR\text{-}SI}} = \min \{ (W_0 + W_1) R_{0,1} + (W_0 + W_2) R_{0,2} 
+ (W_0 + W_1 + W_2) R_{0,\{1,2\}} \}$    (7) 

The solution is one of the two points 
$P_2 = \{0, H(X_0|X_1), H(X_0|X_2) - H(X_0|X_1)\}$ or 
$P_3 = \{H(X_0|X_1) - H(X_0|X_2), H(X_0|X_2), 0\}$, and both 
achieve lower total communication cost compared to broadcast 
routing, $C_{br}^*$ in (4), for any $W_0, W_1, W_2 \gg W_{11}, W_{22}$ if 
$H(X_0|X_1) \ne H(X_0|X_2)$. 
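A minimal numerical sketch of this comparison follows. The function names and the example entropy values are assumptions made for illustration; both costs omit the constant side-information terms $W_{11}H(X_1) + W_{22}H(X_2)$, which are identical under the two routing schemes. The operating point returned is $P_2$ when $H(X_0|X_1) \le H(X_0|X_2)$ and $P_3$ otherwise.

```python
def broadcast_cost(a, b, w0, w1, w2):
    """Cost (4) without the side-information terms; a = H(X0|X1), b = H(X0|X2)."""
    return (w0 + w1 + w2) * max(a, b)

def dir_si_point(a, b):
    """Rates (R_{0,1}, R_{0,{1,2}}, R_{0,2}): P2 if a <= b, P3 otherwise."""
    common = min(a, b)  # rate of the packet routed to both sinks
    return (a - common, common, b - common)

def dir_si_cost(a, b, w0, w1, w2):
    """Objective (7) evaluated at the point returned by dir_si_point."""
    r01, r012, r02 = dir_si_point(a, b)
    return (w0 + w1) * r01 + (w0 + w2) * r02 + (w0 + w1 + w2) * r012

# Hypothetical values: H(X0|X1) = 0.5 bits, H(X0|X2) = 1.0 bit, unit weights.
a, b = 0.5, 1.0
print(dir_si_point(a, b))               # (0.0, 0.5, 0.5)
print(broadcast_cost(a, b, 1, 1, 1))    # 3.0
print(dir_si_cost(a, b, 1, 1, 1))       # 2.5
```

In this example the saving equals $(H(X_0|X_2) - H(X_0|X_1)) W_1$: the extra bits needed only by $S_2$ no longer travel on the branch to $S_1$.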

The above coding scheme can be easily extended to the 
case of arbitrary edge weights. Then, the rate region for the 
tuple $\{R_1, R_2, R_{0,1}, R_{0,\{1,2\}}, R_{0,2}\}$ and the cost function to 
be minimized are given by: 

$C_{DIR} = \min \{ W_{11} R_1 + W_{22} R_2 + (W_0 + W_1) R_{0,1} 
+ (W_0 + W_2) R_{0,2} + (W_0 + W_1 + W_2) R_{0,\{1,2\}} \}$    (8) 

under the constraints: 

$R_1 \ge H(X_1|X_0)$ 
$R_{0,1} + R_{0,\{1,2\}} \ge H(X_0|X_1)$ 
$R_1 + R_{0,1} + R_{0,\{1,2\}} \ge H(X_0, X_1)$ 
$R_2 \ge H(X_2|X_0)$ 
$R_{0,2} + R_{0,\{1,2\}} \ge H(X_0|X_2)$ 
$R_2 + R_{0,2} + R_{0,\{1,2\}} \ge H(X_0, X_2)$    (9) 

Fig. 6: The 2 Source - 2 Sink example. Each source acts as 
the principal source for one sink and as a helper for the other 

If $R_1 = H(X_1)$ and $R_2 = H(X_2)$, (9) specializes to (5) and 
(6). Also, it can easily be shown that the total communication 
cost obtained as a solution to the above formulation is lower 
than that for conventional routing if $W_0, W_1, W_2 > 0$. This 
example clearly demonstrates the gains of DIR over broadcast 
routing to communicate correlated sources over a network. 
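The formulation with arbitrary edge weights can be checked numerically at any candidate rate tuple. The sketch below evaluates the objective (8) and tests the six constraints (9); the dictionary keys, function names and entropy values are hypothetical choices made for illustration, and the code only verifies a given point rather than solving the linear program.

```python
def dir_cost(r, w):
    """Objective (8) for rates r and edge weights w (dicts with hypothetical keys)."""
    return (w["w11"] * r["r1"] + w["w22"] * r["r2"]
            + (w["w0"] + w["w1"]) * r["r01"]
            + (w["w0"] + w["w2"]) * r["r02"]
            + (w["w0"] + w["w1"] + w["w2"]) * r["r012"])

def satisfies_constraints(r, h, tol=1e-9):
    """Check the six inequalities in (9) up to a small numerical tolerance."""
    checks = [
        r["r1"] >= h["h1|0"] - tol,
        r["r01"] + r["r012"] >= h["h0|1"] - tol,
        r["r1"] + r["r01"] + r["r012"] >= h["h01"] - tol,
        r["r2"] >= h["h2|0"] - tol,
        r["r02"] + r["r012"] >= h["h0|2"] - tol,
        r["r2"] + r["r02"] + r["r012"] >= h["h02"] - tol,
    ]
    return all(checks)

# Hypothetical entropies (bits), chosen to be mutually consistent:
# H(X1) = H(X2) = 1, H(X0) = 1.25.
h = {"h1|0": 0.25, "h0|1": 0.5, "h01": 1.5,
     "h2|0": 0.75, "h0|2": 1.0, "h02": 2.0}
w = {"w0": 1.0, "w1": 1.0, "w2": 1.0, "w11": 0.0, "w22": 0.0}

# Side-information operating point: R1 = H(X1), R2 = H(X2), E0 at P2.
r = {"r1": 1.0, "r2": 1.0, "r01": 0.0, "r012": 0.5, "r02": 0.5}
print(satisfies_constraints(r, h), dir_cost(r, w))
```

With $R_1 = H(X_1)$ and $R_2 = H(X_2)$ the sum-rate constraints in (9) reduce to (5) and (6), which is why the point $P_2$ from the side-information case is feasible here.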

Observe that in the above example, the sinks only receive 
information from the source nodes they intend to reconstruct. 
Such a scenario is called the 'No helpers' case in the literature 
[15]. In a network with multiple sources and sinks, if source 
$i$ is to be reconstructed at a subset of sinks $\Pi^i$, power binning 
assigns $2^{|\Pi^i|} - 1$ independently generated indices, each being 
routed to a subset of $\Pi^i$. It will be shown later in Section V 
that power binning achieves minimum cost under DIR, even 
for a general setup, as long as there are no helpers, i.e., when 
each sink is allowed to receive information only from the 
requested sources. However, the problem of establishing the 
complete achievable rate region becomes considerably harder 
when every source is allowed to communicate with every sink, 
a scenario that is highly relevant to practical networks. It 
was shown in [21] that for certain networks, unbounded gains 
in communication cost are obtained when source nodes are 
allowed to communicate with sinks that do not reconstruct 
them. In this paper, we derive an achievable rate region for 
this setup. In the following subsection, to keep the notation 
and understanding simple, we begin with one of the simplest 
setups which illustrates the underlying ideas. 
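The count of indices used by power binning is simply the number of non-empty subsets of the sinks that reconstruct a source. As a small sketch (the function name is a hypothetical choice), the destination sets of the packets can be enumerated as follows:

```python
from itertools import combinations

def nonempty_subsets(sinks):
    """List every non-empty subset of the given sinks, smallest subsets first."""
    ordered = sorted(sinks)
    return [
        frozenset(combo)
        for size in range(1, len(ordered) + 1)
        for combo in combinations(ordered, size)
    ]

# Two sinks reconstructing a source give the three packets used in the example.
for subset in nonempty_subsets({1, 2}):
    print(sorted(subset))
```

For two sinks this yields the destinations {1}, {2} and {1, 2}; the number of packets grows exponentially with the number of sinks that request the source.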

C. A simple network with helpers 

We will again provide only an intuitive description of the 
encoding scheme here and defer the formal proofs for the 
general case to Section IV. Consider the network shown in 
Fig. 6. Two source nodes $E_1$ and $E_2$ observe correlated 
memoryless sequences $X_1^n$ and $X_2^n$, respectively. Two sinks 
$S_1$ and $S_2$ require lossless reconstructions of $X_1^n$ and $X_2^n$, 
respectively. The source nodes can communicate with the sinks 
only through a collector node. The edge weights are as shown 
in the figure. Observe that each source, while requested by 
one sink, acts as a helper for the other. 

Under dispersive information routing, each source 
transmits a packet to every subset of sinks. In this example, $E_1$ 
sends 3 packets to the collector at rates $(R_{1,1}, R_{1,2}, R_{1,12})$, 
respectively. The collector forwards the first packet to $S_1$, the 
second to $S_2$ and the third to both $S_1$ and $S_2$. Similarly, $E_2$ 
sends 3 packets to the collector at rates $(R_{2,1}, R_{2,2}, R_{2,12})$ 
which are forwarded to the corresponding sinks. Our 
objective is to determine the set of achievable rate tuples 
$(R_{1,1}, R_{1,2}, R_{1,12}, R_{2,1}, R_{2,2}, R_{2,12})$ that allows for lossless 
reconstruction at the two sinks. The minimum cost then 
follows by finding the point in the achievable rate region which 
minimizes the effective communication cost, $C_{DIR}$, given by: 

$C_{DIR} = \sum_{i=1}^{2} (W_{ic} + W_{c1} + W_{c2}) R_{i,12} + (W_{2c} + W_{c1}) R_{2,1} 
+ \sum_{i=1}^{2} (W_{ic} + W_{ci}) R_{i,i} + (W_{1c} + W_{c2}) R_{1,2}$    (10) 

A non-single-letter characterization of the complete rate 
region is possible using the results of Han and Kobayashi in 
[3]. They also provide a single-letter partial achievable rate 
region. However, applicability of their result requires artificial 
imposition of 3 independent encoders at each source, which is 
an unnecessary restriction. We present a more general 
achievable rate region, which maintains the dependencies between 
the messages at each encoder. Note that the source coding 
setup which arises out of the DIR framework is a special case 
of the general problem of distributed multiple descriptions 
and therefore the principles underlying the coding schemes 
for distributed source coding [3] and multiple descriptions 
encoding [2] play crucial roles in deriving a coding mechanism 
for dispersive information routing. It is interesting to observe 
that, unlike the general MD setting, the DIR framework is 
non-trivial even in the lossless scenario and deriving a complete 
rate region for lossless reconstruction at all the sinks is a 
challenging problem. 

We now give an achievable region for the 
example in Fig. 6. Suppose we are given random variables 
$(U_{1,12}, U_{1,1}, U_{1,2}, U_{2,12}, U_{2,1}, U_{2,2})$ jointly distributed with 
$(X_1, X_2)$ such that the following Markov chain conditions 
hold: 

$(U_{1,12}, U_{1,1}) \leftrightarrow X_1 \leftrightarrow X_2 \leftrightarrow (U_{2,12}, U_{2,1})$ 
$(U_{1,12}, U_{1,2}) \leftrightarrow X_1 \leftrightarrow X_2 \leftrightarrow (U_{2,12}, U_{2,2})$    (11) 



Note that the codeword indices of $U_{i,S}$ are sent in the packet 
from source $E_i$ to sinks $S_j : j \in S$. The encoding is divided 
into 3 stages. 

Encoding: We first focus on the encoding at $E_1$. In the 
first stage, $2^{nR'_{1,12}}$ codewords of $U_{1,12}$, each of length $n$, are 
generated independently, with elements drawn according to 
the marginal density $P(U_{1,12})$. Conditioned on each of these 
codewords, $2^{nR'_{1,1}}$ and $2^{nR'_{1,2}}$ codewords of $U_{1,1}$ and $U_{1,2}$ are 
generated according to the conditional densities $P(U_{1,1}|U_{1,12})$ 
and $P(U_{1,2}|U_{1,12})$, respectively. Codebooks for $U_{2,12}$, $U_{2,1}$ 
and $U_{2,2}$ are generated at $E_2$ in a similar fashion. On observing 
a sequence $x_1^n$, $E_1$ first tries to find a codeword tuple from the 
codebooks of $(U_{1,12}, U_{1,1}, U_{1,2})$ such that $(x_1^n, u_{1,12}^n, u_{1,1}^n) \in T_\epsilon^n$ 
and $(x_1^n, u_{1,12}^n, u_{1,2}^n) \in T_\epsilon^n$. The probability of finding such 
a codeword tuple approaches 1 if 

$R'_{1,12} \ge I(X_1; U_{1,12})$ 
$R'_{1,1} \ge I(X_1; U_{1,1} | U_{1,12})$ 
$R'_{1,2} \ge I(X_1; U_{1,2} | U_{1,12})$    (12) 



Let the codewords selected be denoted by $(u_{1,12}, u_{1,1}, u_{1,2})$. 
Similar constraints on $(R'_{2,1}, R'_{2,2}, R'_{2,12})$ must be 
satisfied for encoding at $E_2$. Denote the codewords 
selected at $E_2$ by $(u_{2,12}, u_{2,1}, u_{2,2})$. It follows 
from (11) and the 'Conditional Markov Lemma' in 
[10] that $(x_1^n, x_2^n, u_{1,12}, u_{1,1}, u_{2,12}, u_{2,1}) \in T_\epsilon^n$ and 
$(x_1^n, x_2^n, u_{1,12}, u_{1,2}, u_{2,12}, u_{2,2}) \in T_\epsilon^n$ with high probability. 

In the second stage of encoding, each encoder uniformly 
divides the $2^{nR'_{i,S}}$ codewords of $U_{i,S}$ into $2^{nR''_{i,S}}$ bins, $\forall i \in 
\{1,2\}$, $S \in \{1,2,12\}$. All the codewords which have the 
same bin index $m$ are said to fall in the bin $C_{i,S}(m)$, 
$\forall m \in \{1, \ldots, 2^{nR''_{i,S}}\}$. Note that the number of codewords in 
bin $C_{i,S}(m)$ is $2^{n(R'_{i,S} - R''_{i,S})}$. If $E_1$ selects the codewords 
$(u_{1,12}, u_{1,1}, u_{1,2})$ in the first stage and if the bin indices 
associated with $(u_{1,12}, u_{1,1}, u_{1,2})$ are $(m_{1,12}, m_{1,1}, m_{1,2})$, then 
index $m_{1,1}$ is routed to sink $S_1$, $m_{1,2}$ to sink $S_2$ and 
$m_{1,12}$ to both the sinks $S_1$ and $S_2$. Similarly, bin indices 
$(m_{2,12}, m_{2,1}, m_{2,2})$ are routed from $E_2$ to the corresponding 
sinks. 
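The bin-size bookkeeping of the second stage can be checked with a short sketch. It assumes a deterministic assignment of consecutive codeword indices to bins, which is only one way to divide a codebook uniformly, and uses hypothetical values for the block length and the two rates (written `r_code` for the codebook rate and `r_bin` for the bin-index rate).

```python
from collections import Counter

def uniform_bins(num_codewords, num_bins):
    """Map each codeword index to a bin index so that all bins have equal size."""
    if num_codewords % num_bins != 0:
        raise ValueError("number of bins must divide the number of codewords")
    per_bin = num_codewords // num_bins
    return [k // per_bin for k in range(num_codewords)]

# Hypothetical parameters: n = 8, codebook rate 0.75, bin-index rate 0.5.
n, r_code, r_bin = 8, 0.75, 0.5
num_codewords = 2 ** int(n * r_code)   # 2^{n R'}  = 64 codewords
num_bins = 2 ** int(n * r_bin)         # 2^{n R''} = 16 bins

assignment = uniform_bins(num_codewords, num_bins)
sizes = Counter(assignment)

# Every bin holds 2^{n (R' - R'')} = 4 codewords.
print(set(sizes.values()))
```

Only the bin index is transmitted, so the decoder must later pick the correct codeword out of the 4 in the received bin using joint typicality with the other decoded codewords.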

The third stage of encoding resembles the 'Power Binning' 
scheme described in Section III-B. Every typical sequence of 
$X_1$ is assigned a random bin index uniformly chosen over 
$[1 : 2^{nR'''_{1,1}}]$. All sequences with the same index, $l_{1,1}$, form a bin 
$\mathcal{B}_{1,1}(l_{1,1})$, $\forall l_{1,1} \in \{1, \ldots, 2^{nR'''_{1,1}}\}$. Upon observing a sequence 
$x_1^n \in T_\epsilon^n$ with bin index $l_{1,1}$, in addition to $m_{1,1}$ (from the 
second stage of encoding), encoder $E_1$ also routes index $l_{1,1}$ 
to sink $S_1$. Similarly, bin index $l_{2,2}$ is routed from $E_2$ to $S_2$ in 
addition to $m_{2,2}$. These bin indices are used to reconstruct $X_1^n$ 
and $X_2^n$ losslessly at the respective decoders. Note that, in a 
general setup, if source $i$ is to be reconstructed at a subset of 
sinks $\Pi^i$, the source assigns $2^{|\Pi^i|} - 1$ independently generated 
indices, each being routed to a subset of $\Pi^i$. We also note that 
$U_{1,1}$ and $U_{2,2}$ can be conveniently set to constants without 
changing the overall rate region. However, we continue to use 
them to avoid complex notation. 

Decoding: We again focus on the first sink $S_1$. It receives 
the indices $(m_{1,12}, m_{1,1}, m_{2,12}, m_{2,1}, l_{1,1})$. It first looks for a 
pair of unique codewords from $C_{1,12}(m_{1,12})$ and $C_{2,12}(m_{2,12})$ 
which are jointly typical. Obviously, there is at least one pair, 
$(u_{1,12}, u_{2,12})$, which is jointly typical. The probability that no 
other pair of codewords is jointly typical approaches 1 if: 

$(R'_{1,12} - R''_{1,12}) + (R'_{2,12} - R''_{2,12}) \le I(U_{1,12}; U_{2,12})$    (13) 

Noting that $(R'_{1,12} - R''_{1,12}) \ge 0$ and $(R'_{2,12} - R''_{2,12}) \ge 0$, and 
applying the constraints on $R'_{1,12}$ and $R'_{2,12}$ from (12), we get 
the following constraints for $R''_{1,12}$ and $R''_{2,12}$: 

$R''_{1,12} \ge I(X_1; U_{1,12} | U_{2,12})$ 
$R''_{2,12} \ge I(X_2; U_{2,12} | U_{1,12})$ 
$R''_{1,12} + R''_{2,12} \ge I(X_1, X_2; U_{1,12}, U_{2,12})$    (14) 
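The step from (13) to (14) can be spelled out as follows. This is a sketch of the omitted algebra, writing $R'_{i,12}$ for the codebook rates constrained by (12) and $R''_{i,12}$ for the bin-index rates, and using only the Markov chain $U_{1,12} \leftrightarrow X_1 \leftrightarrow X_2 \leftrightarrow U_{2,12}$ implied by (11):

```latex
\begin{align*}
R''_{1,12} + R''_{2,12}
  &\ge R'_{1,12} + R'_{2,12} - I(U_{1,12};U_{2,12})
     && \text{from (13)} \\
  &\ge I(X_1;U_{1,12}) + I(X_2;U_{2,12}) - I(U_{1,12};U_{2,12})
     && \text{from (12)} \\
  &= I(X_1,X_2;U_{1,12},U_{2,12})
     && \text{by the Markov chain,} \\[4pt]
R''_{1,12}
  &\ge R'_{1,12} - I(U_{1,12};U_{2,12})
     && \text{from (13), since } R'_{2,12} - R''_{2,12} \ge 0 \\
  &\ge I(X_1;U_{1,12}) - I(U_{1,12};U_{2,12})
     && \text{from (12)} \\
  &= I(X_1;U_{1,12} \mid U_{2,12})
     && \text{by the Markov chain.}
\end{align*}
```

The bound on $R''_{2,12}$ follows by symmetry. The two equalities use $H(U_{1,12}|X_1,U_{2,12}) = H(U_{1,12}|X_1)$ and $H(U_{1,12},U_{2,12}|X_1,X_2) = H(U_{1,12}|X_1) + H(U_{2,12}|X_2)$, both consequences of the Markov chain.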

The decoder at $S_1$ next looks at the codebooks of $U_{1,1}$ and 
$U_{2,1}$, which were generated conditioned on $u_{1,12}$ and $u_{2,12}$, 
respectively, to find a unique pair of codewords from $C_{1,1}(m_{1,1})$ 
and $C_{2,1}(m_{2,1})$ which are jointly typical with $(u_{1,12}, u_{2,12})$. 
We again have one pair, $(u_{1,1}, u_{2,1})$, which is jointly typical 
with $(u_{1,12}, u_{2,12})$. It can be shown using arguments similar 
to [3] that the probability of finding no other jointly typical 
pair approaches 1 if: 

$(R'_{1,1} - R''_{1,1}) \le I(U_{1,1}; U_{2,1}, U_{2,12} | U_{1,12})$ 
$(R'_{2,1} - R''_{2,1}) \le I(U_{2,1}; U_{1,1}, U_{1,12} | U_{2,12})$ 
$(R'_{1,1} - R''_{1,1}) + (R'_{2,1} - R''_{2,1}) \le H(U_{1,1}|U_{1,12}) + H(U_{2,1}|U_{2,12}) 
- H(U_{1,1}, U_{2,1} | U_{1,12}, U_{2,12})$    (15) 

On substituting the constraints for $R'_{1,1}$ and $R'_{2,1}$ from (12), 
and using the Markov chain condition in (11), we get: 

$R''_{1,1} \ge I(X_1; U_{1,1} | U_{1,12}, U_{2,12}, U_{2,1})$ 
$R''_{2,1} \ge I(X_2; U_{2,1} | U_{1,12}, U_{2,12}, U_{1,1})$ 
$R''_{1,1} + R''_{2,1} \ge I(X_1, X_2; U_{1,1}, U_{2,1} | U_{1,12}, U_{2,12})$    (16) 

After successfully decoding the codewords 
$(u_{1,12}, u_{1,1}, u_{2,12}, u_{2,1})$, the decoder at $S_1$ looks for a 
unique sequence from $\mathcal{B}_{1,1}(l_{1,1})$ which is jointly typical 
with $(u_{1,12}, u_{1,1}, u_{2,12}, u_{2,1})$. We again have $x_1^n$ satisfying 
this property. It can be shown that the probability of 
finding no other sequence which is jointly typical with 
$(u_{1,12}, u_{1,1}, u_{2,12}, u_{2,1})$ approaches 1 if: 

$R'''_{1,1} \ge H(X_1 | U_{1,12}, U_{2,12}, U_{1,1}, U_{2,1})$    (17) 

Similar conditions at sink $S_2$ lead to the following constraints: 

$R''_{2,2} \ge I(X_2; U_{2,2} | U_{1,12}, U_{2,12}, U_{1,2})$ 
$R''_{1,2} \ge I(X_1; U_{1,2} | U_{1,12}, U_{2,12}, U_{2,2})$ 
$R''_{2,2} + R''_{1,2} \ge I(X_1, X_2; U_{2,2}, U_{1,2} | U_{1,12}, U_{2,12})$ 
$R'''_{2,2} \ge H(X_2 | U_{1,12}, U_{2,12}, U_{1,2}, U_{2,2})$    (18) 

The first packet from $E_1$, destined to only $S_1$, 
carries indices $(m_{1,1}, l_{1,1})$ at rate $R_{1,1} = R''_{1,1} + R'''_{1,1}$. 
The second and third packets carry $m_{1,2}$ and $m_{1,12}$ 
at rates $R_{1,2} = R''_{1,2}$ and $R_{1,12} = R''_{1,12}$, 
respectively, and are routed to the corresponding sinks. 
Similarly, 3 packets are transmitted from $E_2$ carrying indices 
$\{m_{2,1}, m_{2,12}, (m_{2,2}, l_{2,2})\}$ at rates $(R_{2,1}, R_{2,12}, R_{2,2}) = 
(R''_{2,1}, R''_{2,12}, R''_{2,2} + R'''_{2,2})$ to sinks $\{S_1, (S_1, S_2), S_2\}$, 
respectively. Constraints for $(R_{1,1}, R_{1,2}, R_{1,12}, R_{2,1}, R_{2,2}, R_{2,12})$ 
can now be obtained using (14), (16), (17) and (18). The convex 
closure of achievable rates over all such random variables 
$(U_{1,12}, U_{1,1}, U_{1,2}, U_{2,12}, U_{2,1}, U_{2,2})$ gives the achievable rate 
region for the 2 source - 2 sink DIR problem. It is easy 
to verify that this region subsumes the region that would be 
produced by employing the approach of Han and Kobayashi 
[3], which must assume three independent encoders at each 
source. Observe that in the above illustration, we assumed 
that the decoding is performed in a sequential manner, i.e., the 
codewords of $U_{i,12}$ are decoded first, followed by the 
codewords of $U_{i,1}$ and $U_{i,2}$, respectively. This was done only 
for ease of understanding. In Theorem 1 we derive the 
conditions on rates for the decoders to find typical sequences 
from all the codebooks jointly (at once). Note that the conditions 
on the rates for joint decoding are generally weaker (the region 
is larger) than those for sequential decoding. 

IV. Dispersive Information Routing - General 
Setup 

Let a network be represented by an undirected connected
graph $G = (V, \mathcal{E})$. Each edge $e \in \mathcal{E}$ is associated with
an edge weight, $w_e$. The communication cost is assumed
to be a simple product of the edge rate and edge weight,^
i.e., $C_e = R_e w_e$. The nodes $V$ consist of $N$ source
nodes (denoted by $E_1, E_2, \ldots, E_N$), $M$ sinks (denoted by
$S_1, S_2, \ldots, S_M$), and $|V| - N - M$ intermediate nodes. We define
the sets $\Sigma = \{1, \ldots, N\}$ and $\Pi = \{1, \ldots, M\}$. Source node
$E_i$ observes $n$ iid random variables $X_i^n$, each taking values
over a finite alphabet $\mathcal{X}_i$. Sink $S_j$ reconstructs (requests) a
subset of the sources specified by $\Sigma_j \subseteq \Sigma$. Conversely, source
node $E_i$ is reconstructed at a subset of sinks specified by
$\Pi_i \subseteq \Pi$. The objective is to find the minimum communication
cost achievable by dispersive information routing for lossless
reconstruction of the requested sources at each sink when
every source node can (possibly) communicate with every
sink.

A. Obtaining the effective costs 

Under DIR each source transmits at most $2^M - 1$ packets
into the network, each meant for a different subset of sinks.
Note that, while $\Pi_i$ is the subset of sinks reconstructing $X_i^n$,
$E_i$ may be transmitting packets to many other subsets of sinks.
Let the packet from source $E_i$ to the subset of sinks $\mathcal{K} \subseteq \Pi$
be denoted by $\mathcal{P}_{i,\mathcal{K}}$ and let it carry information at rate $R_{i,\mathcal{K}}$.

The optimum route for packet $\mathcal{P}_{i,\mathcal{K}}$ from the source to these
sinks is determined by a spanning tree optimization (minimum
Steiner tree) [19]. More specifically, for each packet $\mathcal{P}_{i,\mathcal{K}}$, the
optimum route is obtained by minimizing the cost over all trees
rooted at node $i$ which span all sinks $j \in \mathcal{K}$. The minimum
cost of transmitting packet $\mathcal{P}_{i,\mathcal{K}}$ with $R_{i,\mathcal{K}}$ bits from source $i$
to the subset of sinks $\mathcal{K}$, denoted by $d_i(\mathcal{K})$, is:

\[
d_i(\mathcal{K}) = R_{i,\mathcal{K}} \min_{Q \in S_{i,\mathcal{K}}} \sum_{e \in Q} w_e \tag{19}
\]


where $S_{i,\mathcal{K}}$ denotes the set of all paths from source $i$ to
the subset of sinks $\mathcal{K}$. Having obtained the effective cost
for each packet in the network, our next objective is to find
an achievable rate region for the tuple $(R_{i,\mathcal{K}}\ \forall i \in \Sigma, \mathcal{K} \subseteq \Pi)$.
The minimum communication cost then follows directly
from a simple linear programming formulation. Note that the
minimum Steiner tree problem is NP-hard and requires
approximate algorithms to solve in practice. Also note that, in
theory, each encoder transmits $2^M - 1$ packets into the network,
while in practice we might be able to realize improvements
over broadcast routing using significantly fewer packets (see
e.g., [31]).

^The approach is applicable to more general cost functions.
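The effective-cost step can be illustrated on a very small network. The sketch below is only an illustration under stated assumptions: the node names, the edge weights and the helper functions `mst_weight` and `effective_cost` are hypothetical, and the exhaustive search over relay subsets is practical only for toy graphs (the text notes that the general problem is NP-hard). It returns the minimum total edge weight of a tree connecting a source to a subset of sinks, i.e., the cost per unit rate of the corresponding packet.

```python
from itertools import combinations

def mst_weight(nodes, edges):
    """Weight of a minimum spanning tree of the subgraph induced by `nodes`
    (Prim's algorithm); returns None if that subgraph is disconnected."""
    nodes = set(nodes)
    start = next(iter(nodes))
    reached, total = {start}, 0.0
    while reached != nodes:
        best = None  # cheapest edge leaving the current tree
        for (u, v), w in edges.items():
            if u in nodes and v in nodes and (u in reached) != (v in reached):
                if best is None or w < best[0]:
                    best = (w, v if u in reached else u)
        if best is None:
            return None
        total += best[0]
        reached.add(best[1])
    return total

def effective_cost(source, sinks, all_nodes, edges):
    """Minimum total edge weight of a tree joining `source` to all `sinks`,
    found by trying every subset of the other nodes as relays."""
    terminals = {source} | set(sinks)
    optional = [v for v in all_nodes if v not in terminals]
    best = None
    for r in range(len(optional) + 1):
        for extra in combinations(optional, r):
            w = mst_weight(terminals | set(extra), edges)
            if w is not None and (best is None or w < best):
                best = w
    return best

# Hypothetical toy network: one source, one relay "v", two sinks.
nodes = ["E1", "v", "S1", "S2"]
edges = {("E1", "v"): 1.0, ("v", "S1"): 1.0, ("v", "S2"): 1.0,
         ("E1", "S1"): 3.0, ("E1", "S2"): 3.0}
costs = {K: effective_cost("E1", K, nodes, edges)
         for K in [("S1",), ("S2",), ("S1", "S2")]}
print(costs)
```

Multiplying each entry of `costs` by the rate of the corresponding packet gives the packet's contribution to the total communication cost, which is the linear objective mentioned above.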

B. An achievable rate region 

In what follows, we use the shorthand $\{U_i\}_S$ for $\{U_{i,\mathcal{K}} : \mathcal{K} \in S\}$
and $\{U_\Gamma\}_S$ for $\{U_{i,\mathcal{K}} : i \in \Gamma, \mathcal{K} \in S\}$.
Note the difference between $\{U_i\}_S$ and $U_{i,S}$: $\{U_i\}_S$
is a set of variables, whereas $U_{i,S}$ is a single variable.
For example, $\{U_1\}_{(1,2,12)}$ denotes the set of variables
$(U_{1,1}, U_{1,2}, U_{1,12})$ and $\{U_{(1,2)}\}_{(1,2,12)}$ represents the set
$(U_{1,1}, U_{1,2}, U_{1,12}, U_{2,1}, U_{2,2}, U_{2,12})$.

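We first give a formal definition of a block code and an
associated rate region for DIR. We denote the set $\{1, 2, \ldots, L\}$ by
$I_L$ for any positive integer $L$. We assume that the source node
$E_i$ observes the random sequence $X_i^n$. An
$(n, P_e, L_{i,\mathcal{K}}\ \forall i \in \Sigma, \mathcal{K} \in 2^{\Pi} - \phi)$ DIR-code is defined by the following
mappings:

• Encoders:

\[
f_i^E : \mathcal{X}_i^n \to \prod_{\mathcal{K} \in 2^{\Pi} - \phi} I_{L_{i,\mathcal{K}}} \tag{20}
\]

• Decoders:

\[
f_j^D : \prod_{i \in \Sigma} \prod_{\mathcal{K} \in 2^{\Pi} : j \in \mathcal{K}} I_{L_{i,\mathcal{K}}} \to \{\mathcal{X}^n\}_{\Sigma_j} \tag{21}
\]

Denoting $f_i^E(X_i^n) = \{T_i\}_{2^{\Pi} - \phi}$, where $1 \le T_{i,\mathcal{K}} \le L_{i,\mathcal{K}}$, the
decoder estimates are given by:

\[
\{\hat{X}^n\}_{\Sigma_j} = f_j^D\big(\{T_\Sigma\}_{(\mathcal{K} \in 2^{\Pi} : j \in \mathcal{K})}\big) \tag{22}
\]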

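Note the correspondence between the encoder-decoder mappings
and dispersive information routing. Observe that packet
$\mathcal{P}_{i,\mathcal{K}}$ carries $T_{i,\mathcal{K}}$ at rate $L_{i,\mathcal{K}}$ from source $i$ to the subset of
sinks $\mathcal{K}$. The probability of error is defined as:

\[
P_e = \frac{1}{M} \sum_{j \in \Pi} P\big(\{X^n\}_{\Sigma_j} \ne \{\hat{X}^n\}_{\Sigma_j}\big) \tag{23}
\]
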

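A rate tuple $\{R_{i,\mathcal{K}}\ \forall i, \mathcal{K}\}$ is said to be achievable if for any
$\eta > 0$ and $0 < \epsilon < 1$, there exists an
$(n, P_e, L_{i,\mathcal{K}}\ \forall i \in \Sigma, \mathcal{K} \in 2^{\Pi} - \phi)$ code for $n$ sufficiently large such that

\[
\frac{1}{n} \log L_{i,\mathcal{K}} \le R_{i,\mathcal{K}} + \eta \tag{24}
\]

with the probability of error less than $\epsilon$, i.e.,

\[
P_e < \epsilon \tag{25}
\]
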

We extend the coding scheme described in Section III-C
to derive an achievable rate region for the tuple
$(R_{i,\mathcal{K}}\ \forall i \in \Sigma, \mathcal{K} \in 2^{\Pi} - \phi)$ using principles from multiple descriptions
encoding f2|, jSj, (T^j and Han and Kobayashi decoding jSj, 
albeit with more complex notation. Without loss of generality,
we assume that every source can send packets to every sink.
Before stating the achievable rate region in Theorem 1, we
define the following subsets of $2^{\Pi}$:

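\[
\begin{aligned}
\mathcal{I}_W &= \{\mathcal{K} : \mathcal{K} \in 2^{\Pi},\ |\mathcal{K}| = W\} \\
\mathcal{I}_{W+} &= \{\mathcal{K} : \mathcal{K} \in 2^{\Pi},\ |\mathcal{K}| > W\}
\end{aligned} \tag{26}
\]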

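Let $B$ be any subset of $\Pi$ with $|B| \le W$. We define the
following subsets of $\mathcal{I}_W$ and $\mathcal{I}_{W+}$:

\[
\begin{aligned}
\mathcal{I}_W(B) &= \{\mathcal{K} : \mathcal{K} \in \mathcal{I}_W,\ B \subseteq \mathcal{K}\} \\
\mathcal{I}_{W+}(B) &= \{\mathcal{K} : \mathcal{K} \in \mathcal{I}_{W+},\ B \subseteq \mathcal{K}\}
\end{aligned} \tag{27}
\]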

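We also define:

\[
J(B) = \{\mathcal{K} : \mathcal{K} \in 2^{\Pi},\ |\mathcal{K} \cap B| > 0\} \tag{28}
\]

Note that $J(\Pi) = 2^{\Pi} - \phi$. Let $Q$ be any subset of $2^{\Pi} - \phi$. We
say that $Q \in \mathcal{Q}^*$ if it satisfies the following property $\forall \mathcal{K} \in Q$:

\[
\text{if } \mathcal{K} \in Q \ \Rightarrow\ \mathcal{I}_{|\mathcal{K}|+}(\mathcal{K}) \subseteq Q \tag{29}
\]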
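The set families in (26)-(29) are easy to enumerate for a small number of sinks, which can help when checking the index sets that appear in the rate constraints. The following is a minimal sketch; the function names (`I_W`, `I_W_plus`, `J`, `in_Q_star`) and the three-sink example are illustrative assumptions, not notation from the paper.

```python
from itertools import combinations

def nonempty_subsets(sinks):
    """All nonempty subsets of `sinks`, i.e. the family 2^Pi minus the empty set."""
    items = sorted(sinks)
    return [frozenset(c) for r in range(1, len(items) + 1)
            for c in combinations(items, r)]

def I_W(sinks, W, B=frozenset()):
    """Subsets of size exactly W that contain B, as in (26) and (27)."""
    B = frozenset(B)
    return {K for K in nonempty_subsets(sinks) if len(K) == W and B <= K}

def I_W_plus(sinks, W, B=frozenset()):
    """Subsets of size strictly greater than W that contain B."""
    B = frozenset(B)
    return {K for K in nonempty_subsets(sinks) if len(K) > W and B <= K}

def J(sinks, B):
    """Subsets that intersect B, as in (28)."""
    B = frozenset(B)
    return {K for K in nonempty_subsets(sinks) if K & B}

def in_Q_star(sinks, Q):
    """Property (29): every strict superset of a member of Q is also in Q."""
    Q = set(Q)
    return all(I_W_plus(sinks, len(K), K) <= Q for K in Q)

sinks = {1, 2, 3}
print(sorted(map(sorted, J(sinks, {1}))))            # subsets containing sink 1
print(in_Q_star(sinks, {frozenset({1, 2}), frozenset({1, 2, 3})}))
print(in_Q_star(sinks, {frozenset({1})}))
```

Property (29) mirrors the hierarchical codebook generation: a codeword indexed by a subset can only be shared if the codewords of all its strict supersets are shared too.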



Let $\{U_\Sigma\}_{J(\Pi)}$ be any set of $N(2^M - 1)$ random variables
defined on arbitrary finite alphabets, jointly distributed with
$\{X\}_\Sigma$ satisfying the following: $\forall j \in \Pi$,

Pi{X}^,{U^}jU)) = n^({^0^0)l^i) (30) 

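The above Markov condition ensures that all the codewords
which reach a sink are jointly typical with $\{X\}_\Sigma$.
We define $\alpha(i, Q)$ as:

\[
\alpha(i, Q) = -H(\{U_i\}_Q | X_i) + \sum_{\mathcal{K} \in Q} H\big(U_{i,\mathcal{K}} \,\big|\, \{U_i\}_{\mathcal{I}_{|\mathcal{K}|+}(\mathcal{K})}\big) \tag{31}
\]

$\forall i \in \Sigma$, $Q \subseteq J(\Pi)$. We further define $\beta(k, Q_1, Q_2, \ldots, Q_N)$
$\forall k \in \Pi$, $Q_1, Q_2, \ldots, Q_N \subseteq J(k)$ as:

\[
\beta(k, Q_1, Q_2, \ldots, Q_N) = H\big(\{U_i\}_{Q_i^c}\ \forall i \,\big|\, \{U_i\}_{Q_i}\ \forall i\big)
- \sum_{i \in \Sigma} \sum_{\mathcal{K} \in Q_i^c} H\big(U_{i,\mathcal{K}} \,\big|\, \{U_i\}_{\mathcal{I}_{|\mathcal{K}|+}(\mathcal{K})}\big) \tag{32}
\]

where $Q_i^c = J(k) - Q_i$, and define $\gamma_k(\Gamma)$ as:

\[
\gamma_k(\Gamma) = H\big(\{X\}_\Gamma \,\big|\, \{X\}_{\Gamma^c}, \{U_\Sigma\}_{J(k)}\big) \quad \forall k \in \Pi,\ \Gamma \subseteq \Sigma_k \tag{33}
\]

where $\Gamma^c = \Sigma_k - \Gamma$. We state our main result in the following
Theorem.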

Theorem 1. Achievable Rate Region for DIR :Let {/75]}2n_0 
be any set of random variables satisfying (30). Let {R- Vi G 
E, /C G 2^ — (/)) be any set of auxiliary rate tuples such that: 



(34) 



VQ G Q*. Fwr^/z^^ /^^ (i?J;c Vi G E, /C G 2" - 0) be any set 
of rate tuples such that; 

E E <K>E E Ri,jc+mQi,Q2,---QN) (35) 



Fig. 7: Illustrates the order of codebook generation at source



for each k eU, VQi, Q2, • • • Qat ^ «^(^) satisfying (29) ^-t/c/z 
3z G {1, . . . , AT} : 7^ J{k). Let {Ri^ Vz G E,/C G 



satisfy: 



E 



^ i?i,;c>7fe(r) 



(36) 



V/c G n, F G 2^^ — 0. Then, the achievable rate region for the 
tuple {Ri^s Vz G E, 5 G 2^ — (/)) contains all rates such that, 



R 



> 



(37) 



The convex closure of the achievable tuples over all such
$N(2^M - 1)$ random variables satisfying (30) is the achievable
rate region for DIR and is denoted by $\mathcal{R}_{DIR}$.

Remark 1. The converse to this achievability region does not
hold in general. A simple counterexample follows from the
famous binary modulo-two sum problem proposed by Körner
and Marton for the 2-helper setup in [32]. However, in Section
V we prove the converse for certain special cases.

Remark 2. The coding scheme in Theorem 1 can be easily
specialized to 'power binning' by setting $\{U_\Sigma\}_{2^{\Pi} - \phi}$ to constants.
This effectively becomes the 'no-helpers' scenario, as setting
$\{U_\Sigma\}_{2^{\Pi} - \phi}$ to constants implies that $R_{i,S} = 0\ \forall S \notin 2^{\Pi_i}$.

Proof: We follow the notation and the notion of strong
typicality defined in [3]. We refer to [3] (Section 3) for formal
definitions and basic lemmas associated with typicality.

Encoding: Suppose we are given $\{U_\Sigma\}_{2^{\Pi} - \phi}$ satisfying (30).
As in Section III-C, the encoding at each node is divided into
3 stages:

1) Stage 1 : We focus on the encoding at source node 
Ei. The codebook generation is done following the or- 
der of JL^k:, |/C| = M,M - 1,M - 2...,1 as shown 

in Fig. M First, 2"^^* " independent codewords of /7i,n, 

u'^uU) j ^ {1 . . . 2^^*'"}, are generated according to the 



density Ylt^i ^Ui^ui^i^u)- Conditioned on each codeword 



ul^{j), 2^^^>^ codewords of Ui^jc : |/C| = M - 1 are gen- 
erated independent of each other according to the conditional 



density Ylt 



Similarly, V/C : \JC\ < M, 
2^^i,/c codewords of Ui^jc are independently generated condi- 
tioned on each codeword tuple of {Ui}x^^^_^{K:) according to 



) . Note that to gen- 



erate the codewords of Ui^jc, we first need all the codebooks 
of {Ui}x\jc\+iK:)- On observing a sequence, x^, the encoder at 
Ei attempts to find a set of codewords, one for each variable, 
such that they are all jointly typical. If it fails to find such 
a set, it declares an error. Codebooks are generated similarly 
at all the source nodes. Note that all the random variables 
/7i,iVz G S can be set to constants without changing the rate 
region of Theorem [T] However, we continue to use them to 
avoid more complex notation. 

2) Stage 2 : In stage 2, the codewords in each codebook are 
divided into uniform bins. Specifically, the 2^^^^^ codewords 
in any codebook of Ui^k: are subdivided into 2^^^^^ bins, 
with each bin containing 2^*^^^'^"^^'^^ codewords. All the 
codewords which have the same bin index m are said to fall 
in the bin Ci^im) Vm G (1 . . . 2"^^^'^). If in stage 1, the 
encoder succeeds in finding a jointly typical set of codewords, 
the bin index of the codeword ofUi^jc is sent as part of packet 

3) Stage 3 : Power Binning : In this stage, each typical se- 
quence of Xi is assigned 2 1^*1 —1 indices, randomly generated 
using uniform pmfs over (1, . . . , 2^^'^) V/C G 2^^ —0, respec- 
tively. All the sequences of i which have the same bin index 
/ are said to fall in the bin Bi^jcil) V/ G (1 . . . 2^^^'^). On 
observing a sequence x^, if it is typical, the encoder sends the 
corresponding bin indices in the packets Vi^jc • ^ ^ 2^^ — (/), 
in addition to the bin indices in stage 2. If it is not typical, the 
encoder declares an error. Note that all packets from source 
node Ei to a subset of sinks /C such that JC C 2^^ — ^, carry 
two bin indices, one each from stages 2 and 3, respectively. 
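The stage-3 'power binning' step can be pictured with a toy sketch. Everything below is an illustrative assumption rather than the construction used in the proof: the function name `power_bin_indices`, the use of a seeded pseudo-random generator as a stand-in for the random codebook, and the `bits` dictionary, which plays the role of $n$ times the stage-3 rate of each subset. The sketch only shows the bookkeeping: one independent, uniformly drawn index per nonempty subset of the sinks that reconstruct the source.

```python
import random
from itertools import combinations

def power_bin_indices(x, sinks_i, bits, seed=0):
    """Assign the sequence `x` one bin index per nonempty subset K of `sinks_i`.

    bits[K] is the number of bits available for subset K, so the index for K
    is drawn uniformly from {1, ..., 2**bits[K]}. Seeding the generator with
    (seed, K, x) makes the assignment a fixed function of the sequence, as a
    shared random codebook would be.
    """
    items = sorted(sinks_i)
    indices = {}
    for r in range(1, len(items) + 1):
        for K in combinations(items, r):
            rng = random.Random(f"{seed}|{K}|{x}")
            indices[K] = rng.randrange(1, 2 ** bits[K] + 1)
    return indices

# Hypothetical example: the source is reconstructed at sinks 1 and 2.
sinks_i = {1, 2}
bits = {(1,): 3, (2,): 2, (1, 2): 4}
idx = power_bin_indices((0, 1, 1, 0), sinks_i, bits)
print(idx)  # one index per subset: (1,), (2,), (1, 2)
```

With two reconstructing sinks the sequence receives $2^2 - 1 = 3$ indices, one carried by each of the three packets addressed to the subsets of those sinks.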

In Appendix |a| we show that, if the rates R'- j^ satisfy 



( [34] ), then the probability of encoding error asymptotically 
approaches zero, i.e., we can, with probability approaching 1, 
find a codeword tuple, one from each codebook such that all 
the codewords are jointly typical if the rates satisfy ([34]). Let 
the codewords, which are jointly typical with x^, be denoted 
as li*^ V/C G J7(n) = 2^ — (j). To ensure joint typicality 
of {^|]}j7(/c)), we require a stronger version of the 

"conditional Markov lemma" in fTOl. We state and prove this 
stronger version, called the "conditional Markov lemma for 
mutual covering" in Appendix B. From this lemma, it follows 
that e %"{{X}l,{U^}J^k)) with very 

high probability given that the encoding at all the source nodes 
is error free. Let the bin indices of ^ (assigned in stage 2) 
be denoted by rrii^x: V/C G 2^ — and let the bin indices of 
(assigned in stage 3) be denoted by /i,K:V/C G 2^^ — (j). 
Decoding : We focus on a particular sink Sk- Sink Sk re- 
ceives all the indices {^s}j^(/c) of stage 2 of encoding from all 
source nodes. It also receives {l^k}jik) of stage 3 of encoding 
from source nodes S/c. In the first stage of decoding, it begins 
decoding j^^^-^ Vi G H by looking for a unique jointly 
typical codeword tuple from {C^^j^(/c)(m^ Vi G H}. 

Clearly {u'^}j(^k) satisfies this property. If the decoder finds 
another such jointly typical codeword tuple in the received 
bins, it declares an error. In Appendix [A| we show that if 



conditions (^35l are satisfied by R^ ^, then the probability that 
the decoder nnds another such jointly typical codeword tuple 
approaches zero. 

In the last stage of decoding, after having decoded all 
the decoder looks for unique source sequences from 
rM^ixQ^ix) ' ^ ^ ^/c^/C 3 k} which are jointly typical with 
{u^}j{k)- Hence what remains is to find conditions on Ri^jc to 
ensure lossless reconstruction of the respective sources at each 
sink. Following similar steps as in f3l, |[4|, it is easy to show 
that this probability can be made arbitrarily small if ([36]) is 
satisfied VF G 2^^ — (j). We have shown that if the rates satisfy 
the conditions in Theorem [T] the probability of decoding 
error at each sink can be made arbitrarily small. Arbitrarily 
small decoding error ensures that the decoder decodes the 
correct sequence with very high probability. Hence, if the 
rate constraints are satisfied, for any e > 0, we can find a 
sufficiently large n such that: 



(38) 



Recall that packets from source node Ei to sinks /C C 
carry both m^^^: (at rate R- ^) and li^]c (at rate Ri^Tc)- While 
the other packets carry only mi^jc (at rate Ri j^)- Hence, the 
rates of each packet must satisfy the following constraints for 
lossless decoding of the requested sources: 




Ri^jc if/CC2n^-, 
if /C ^ 2"^ - , 



(39) 



proving the theorem. ■ 

Remark 3. A note on separability of distributed compression
and routing: It was shown in [1] that the two problems
of DSC (Slepian-Wolf compression) and optimum broadcast 
routing are separable problems, i.e., the optimum routes can be 
found without the knowledge of the achievable rates, and vice 
versa, the rate region can be found without the knowledge 
of the routes. However, we demonstrated in that such 
separability holds only under the 'no helpers' assumption. We 
also showed that the extent of suboptimality due to separating 
DSC and broadcast routing is substantial and potentially 
unbounded when helpers are allowed to communicate. In 
general the optimum rate region cannot be found without 
the knowledge of the network costs for broadcast routing. 
However, for DIR, the two problems of finding the optimum
rate region for the tuple $(R_{i,\mathcal{K}}\ \forall i \in \Sigma, \mathcal{K} \in 2^{\Pi} - \phi)$ and
finding the optimum routes from the source nodes to the sinks
can be separated and dealt with independently, without entailing any
loss of optimality. Note that even though DIR has the inherent
advantage of separability, finding the optimum operating point
requires optimizing over an $N \times 2^M$ dimensional space and
the effective complexity remains the same as that for broadcast
routing.

V. OUTERBOUNDS TO CERTAIN SPECIAL SCENARIOS 

We note that the converse to the achievability region does
not hold in general. However, we can prove the converse for
two important special cases.

Fig. 8: Example of a 2-sink, 1-helper DIR



A. When there are no helpers 

Theorem 2. When each sink is allowed to receive packets
only from sources it intends to reconstruct, the complete rate
region for dispersive information routing is given by: $\forall j \in \Pi$
and $\forall S \in 2^{\Sigma_j} - \phi$:

\[
\sum_{i \in S} \sum_{\mathcal{K} \in 2^{\Pi_i} : j \in \mathcal{K}} R_{i,\mathcal{K}} \ge H\big(\{X\}_S \,\big|\, \{X\}_{\Sigma_j - S}\big) \tag{40}
\]

It is achieved by 'Power Binning'.

Proof: In the achievable rate region of Theorem 1, setting
$U_{i,S} = \Phi\ \forall i \in \Sigma, S \in 2^{\Pi} - \phi$, where $\Phi$ is a constant,
leads to the above rate region. The converse to this rate region
follows directly from the converse to the lossless source coding
theorem [26]. We omit the proof as it is straightforward. ■
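The right-hand side of (40) is a conditional entropy of the requested sources, so it can be evaluated directly from a joint pmf. The sketch below is an illustration under assumptions: the helper names and the example pmf (two binary sources that disagree with probability 0.1) are hypothetical and not taken from the paper.

```python
from math import log2

def entropy(pmf):
    """Shannon entropy in bits of a pmf given as {outcome: probability}."""
    return -sum(p * log2(p) for p in pmf.values() if p > 0)

def marginal(joint, keep):
    """Marginal pmf of the coordinates listed in `keep`."""
    out = {}
    for outcome, p in joint.items():
        key = tuple(outcome[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return out

def conditional_entropy(joint, S, rest):
    """H(X_S | X_rest) = H(X_S, X_rest) - H(X_rest)."""
    both = list(S) + list(rest)
    return entropy(marginal(joint, both)) - entropy(marginal(joint, list(rest)))

# Hypothetical joint pmf of (X_1, X_2): uniform X_1, X_2 differs w.p. 0.1.
joint = {(0, 0): 0.45, (1, 1): 0.45, (0, 1): 0.05, (1, 0): 0.05}
h1_given_2 = conditional_entropy(joint, S=[0], rest=[1])
h2_given_1 = conditional_entropy(joint, S=[1], rest=[0])
h_joint = conditional_entropy(joint, S=[0, 1], rest=[])
print(h1_given_2, h2_given_1, h_joint)
```

For a sink requesting both sources, the three printed values are the bounds that (40) places on the rate received from the first source, from the second source, and from both together.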



B. A 2 -Sink network with a single helper 

The converse can be proven in general for any 2-sink
network with a single helper. However, to avoid complex
notation, we just give a simple example of a 2-sink network
with a single helper and prove the converse to the rate region.
The proof of the converse for a general 2-sink network with a
single helper follows along similar lines.

Consider the network shown in Fig. 8 with 3 source
nodes and 2 sinks. The three source nodes $E_1, E_0, E_2$ observe
three correlated memoryless random sequences $X_1^n, X_0^n, X_2^n$,
respectively. The two sinks $S_1$ and $S_2$ respectively reconstruct
$X_1^n$ and $X_2^n$ losslessly. Note that $E_0$ acts as a helper to both
the sinks. Our objective is to find the rate region for the
tuple $(R_1, R_2, R_{0,1}, R_{0,2}, R_{0,\{1,2\}})$ for lossless reconstruction
of the respective sources. It is important to remember that
our ultimate objective is to find the minimum communication
cost, which follows by finding the point in the rate region that
minimizes the following cost function:

\[
C_{DIR} = W_{11} R_1 + W_{22} R_2 + (W_0 + W_1) R_{0,1}
+ (W_0 + W_2) R_{0,2} + (W_0 + W_1 + W_2) R_{0,\{1,2\}} \tag{41}
\]

The following theorem establishes the complete rate region. 

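Theorem 3. Let $(U_0, U_1, U_2)$ be random variables distributed
over arbitrary finite sets $\mathcal{U}_0 \times \mathcal{U}_1 \times \mathcal{U}_2$, jointly distributed with
$(X_1, X_0, X_2)$ such that the following hold:

\[
\begin{aligned}
X_1 &\leftrightarrow X_0 \leftrightarrow (U_0, U_1, U_2) \\
X_2 &\leftrightarrow X_0 \leftrightarrow (U_0, U_1, U_2)
\end{aligned} \tag{42}
\]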



Then any rate tuple satisfying the following constraints is
achievable for the 2-Sink 1-Helper DIR problem:

\[
\begin{aligned}
R_{0,12} &\ge I(X_0; U_0) \\
R_{0,1} &\ge I(X_0; U_1 | U_0) \\
R_{0,2} &\ge I(X_0; U_2 | U_0) \\
R_{1,1} &\ge H(X_1 | U_0, U_1) \\
R_{2,2} &\ge H(X_2 | U_0, U_2)
\end{aligned} \tag{43}
\]

The closure of the achievable rates over all such $(U_0, U_1, U_2)$
is the complete rate region for this setup.

Proof: Achievability: Let $(U_0, U_1, U_2)$ be any random
variables satisfying (42). The following achievable rate region
is obtained by setting $U_{0,12} = U_0$, $U_{0,1} = U_1$, $U_{0,2} = U_2$ and
all the remaining random variables to constants in the general
achievable rate region of Theorem 1:

\[
\begin{aligned}
R_{0,12} &\ge I(X_0; U_0) \\
R_{0,12} + R_{0,1} &\ge I(X_0; U_0) + I(X_0; U_1 | U_0) \\
R_{0,12} + R_{0,2} &\ge I(X_0; U_0) + I(X_0; U_2 | U_0) \\
R_{0,12} + R_{0,1} + R_{0,2} &\ge I(X_0; U_1, U_2, U_0) + I(U_1; U_2 | U_0) \\
R_{1,1} &\ge H(X_1 | U_0, U_1) \\
R_{2,2} &\ge H(X_2 | U_0, U_2)
\end{aligned} \tag{44}
\]



We further restrict the joint density to satisfy the following
Markov condition in addition to (42):

\[
U_1 \leftrightarrow (X_0, U_0) \leftrightarrow U_2 \tag{45}
\]

On using this Markov condition in (44), the sum rate constraint
on $R_{0,12} + R_{0,1} + R_{0,2}$ becomes:

\[
R_{0,12} + R_{0,1} + R_{0,2} \ge I(X_0; U_0) + I(X_0; U_1 | U_0) + I(X_0; U_2 | U_0) \tag{46}
\]

Observe that if a rate tuple satisfies (43), then it also satisfies
(44), and hence the region given by (43) is achievable for the
2-Sink 1-Helper problem shown in Fig. 8.
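The simplification of the sum rate constraint under the Markov condition $U_1 \leftrightarrow (X_0, U_0) \leftrightarrow U_2$ is stated without intermediate steps. One way to verify it, using only the chain rule for mutual information, is the following worked derivation (a sketch, not taken from the paper):

```latex
\begin{align*}
I(X_0; U_0, U_1, U_2) + I(U_1; U_2 \mid U_0)
  &= I(X_0; U_0) + I(X_0; U_1 \mid U_0)
     + I(X_0; U_2 \mid U_0, U_1) + I(U_1; U_2 \mid U_0) \\
  &= I(X_0; U_0) + I(X_0; U_1 \mid U_0) + I(X_0, U_1; U_2 \mid U_0) \\
  &= I(X_0; U_0) + I(X_0; U_1 \mid U_0)
     + I(X_0; U_2 \mid U_0) + I(U_1; U_2 \mid X_0, U_0) \\
  &= I(X_0; U_0) + I(X_0; U_1 \mid U_0) + I(X_0; U_2 \mid U_0).
\end{align*}
```

The last step uses $I(U_1; U_2 \mid X_0, U_0) = 0$, which is exactly the added Markov condition. The resulting bound is the sum of the three individual helper constraints, so it is implied by them.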

Converse: Recall the notation in the definition of an
achievable rate region in Section IV-B. The output of encoder
1 is denoted $f_1^E(X_1^n) = T_1$ and the output of encoder 2
is $f_2^E(X_2^n) = T_2$. Remember that $1 \le T_1 \le 2^{nR_1}$ and
$1 \le T_2 \le 2^{nR_2}$. Similarly, the encoder at $E_0$ transmits 3
indices denoted by $(T_{0,1}, T_{0,2}, T_{0,12})$, which are routed to the
respective sinks. Sink $S_1$ receives $(T_1, T_{0,1}, T_{0,12})$ and
reconstructs $X_1^n$ with vanishing probability of error. Similarly, sink
$S_2$ receives $(T_2, T_{0,2}, T_{0,12})$ and reconstructs $X_2^n$ losslessly.
We need to prove that for any code with vanishing probability
of error, the rates must satisfy (43) for some $(U_0, U_1, U_2)$
satisfying (42).

We follow standard converse techniques to prove the above
claim. We begin with the following series of inequalities:

\[
\begin{aligned}
n R_{0,12} \ge H(T_{0,12}) &\ge I(X_0^n; T_{0,12}) \\
&= \sum_{i=1}^{n} I(X_0^i; T_{0,12} | X_0^{1:i-1}) \\
&\stackrel{(a)}{=} \sum_{i=1}^{n} I(X_0^i; T_{0,12}, X_0^{1:i-1}) \\
&\stackrel{(b)}{=} \sum_{i=1}^{n} I(X_0^i; U_{0,12}^i)
\end{aligned} \tag{47}
\]

where (a) follows from the memoryless property of the
sources and (b) follows by setting $U_{0,12}^i = (T_{0,12}, X_0^{1:i-1})$.
Here $X_0^i$ denotes the $i$'th realization of $X_0$ and $X_0^{1:i-1}$ denotes
the first $i - 1$ realizations of $X_0$. Next we have:

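\[
\begin{aligned}
n R_{0,1} \ge H(T_{0,1}) &\ge H(T_{0,1} | T_{0,12}) \\
&\ge I(X_0^n; T_{0,1} | T_{0,12}) \\
&= \sum_{i=1}^{n} I(X_0^i; U_{0,1}^i | U_{0,12}^i)
\end{aligned} \tag{48}
\]

where $U_{0,1}^i = (T_{0,1})\ \forall i$. Similarly, we can show that:

\[
n R_{0,2} \ge \sum_{i=1}^{n} I(X_0^i; U_{0,2}^i | U_{0,12}^i) \tag{49}
\]

where $U_{0,2}^i = (T_{0,2})\ \forall i$. Note that as
$(T_{0,1}, T_{0,2}, T_{0,12}, X_0^{1:i-1})$ depends on $(X_1^i, X_2^i)$ only through
$X_0^i$, we have the following two Markov chain conditions:

\[
\begin{aligned}
X_1^i &\leftrightarrow X_0^i \leftrightarrow (U_{0,12}^i, U_{0,1}^i, U_{0,2}^i) \\
X_2^i &\leftrightarrow X_0^i \leftrightarrow (U_{0,12}^i, U_{0,1}^i, U_{0,2}^i)
\end{aligned} \tag{50}
\]
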

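Further, we need lossless reconstruction of $X_1^n$ at $S_1$. The
following series of inequalities hold:

\[
\begin{aligned}
n R_1 &\ge H(T_1) \\
&\ge H(T_1 | T_{0,12}, T_{0,1}) \\
&= H(T_1 | T_{0,12}, T_{0,1}) + H(X_1^n | T_{0,12}, T_{0,1}, T_1) - H(X_1^n | T_{0,12}, T_{0,1}, T_1) \\
&\stackrel{(a)}{\ge} H(X_1^n, T_1 | T_{0,12}, T_{0,1}) - n\epsilon_n \\
&= H(X_1^n | T_{0,12}, T_{0,1}) - n\epsilon_n \\
&= \sum_{i=1}^{n} H(X_1^i | X_1^{1:i-1}, T_{0,12}, T_{0,1}) - n\epsilon_n \\
&\stackrel{(b)}{\ge} \sum_{i=1}^{n} H(X_1^i | U_{0,12}^i, U_{0,1}^i) - n\epsilon_n
\end{aligned} \tag{51}
\]

where (a) follows from Fano's inequality, i.e.,
$H(X_1^n | T_1, T_{0,1}, T_{0,12}) \le n\epsilon_n$, and (b) holds because
additionally conditioning on $X_0^{1:i-1}$ cannot increase the entropy and, given
$(X_0^{1:i-1}, T_{0,12}, T_{0,1})$, $X_1^i$ is independent of $X_1^{1:i-1}$. Similarly, for lossless
reconstruction at $S_2$, we have:

\[
n R_2 \ge \sum_{i=1}^{n} H(X_2^i | U_{0,12}^i, U_{0,2}^i) - n\epsilon_n \tag{52}
\]
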

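We next introduce a time sharing random variable $Q \sim$
Unif$[1 : n]$, independent of $(X_0^n, X_1^n, X_2^n, U_{0,1}^n, U_{0,2}^n, U_{0,12}^n)$,
so that we can rewrite (47), (48), (49), (51) and (52) as:

\[
\begin{aligned}
n R_{0,12} &\ge n I(X_0^Q; U_{0,12}^Q | Q) = n I(X_0^Q; U_{0,12}^Q, Q) \\
n R_{0,1} &\ge n I(X_0^Q; U_{0,1}^Q | U_{0,12}^Q, Q) = n I(X_0^Q; U_{0,1}^Q, Q | U_{0,12}^Q, Q) \\
n R_{0,2} &\ge n I(X_0^Q; U_{0,2}^Q | U_{0,12}^Q, Q) = n I(X_0^Q; U_{0,2}^Q, Q | U_{0,12}^Q, Q) \\
n R_1 &\ge n H(X_1^Q | U_{0,12}^Q, U_{0,1}^Q, Q) - n\epsilon_n \\
n R_2 &\ge n H(X_2^Q | U_{0,12}^Q, U_{0,2}^Q, Q) - n\epsilon_n
\end{aligned} \tag{53}
\]

Setting $(U_{0,12}^Q, Q) = U_{0,12}$, $(U_{0,1}^Q, Q) = U_{0,1}$, $(U_{0,2}^Q, Q) = U_{0,2}$
and observing that $(X_0^Q, X_1^Q, X_2^Q)$ has the same density
as $(X_0, X_1, X_2)$, we get the rate region given in (43). ■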
Example to demonstrate strict improvement: Next we
show that DIR achieves strictly lower communication cost than
broadcast routing for the single helper network shown in Fig. 8. This example
demonstrates the freedom DIR provides over broadcast routing
by sending only the relevant information to each sink, even
when the information is from a helper. The complete rate
region under broadcast routing for the example shown in Fig. 8
was determined in |14| , p3| and is given by the closure of the 
following rate tuples over all random variables $U_0$ satisfying
$(X_1, X_2) \leftrightarrow X_0 \leftrightarrow U_0$:



\[
\begin{aligned}
R_{0,12} &\ge I(X_0; U_0) \\
R_{1,1} &\ge H(X_1 | U_0) \\
R_{2,2} &\ge H(X_2 | U_0)
\end{aligned} \tag{54}
\]



We consider the example where $(X_0, X_1, X_2)$ are binary
symmetric sources such that $X_1 \leftrightarrow X_0 \leftrightarrow X_2$ holds. The
transition probabilities are such that $X_1$ and $X_2$ are obtained
as outputs of two independent binary symmetric channels
with $X_0$ as input and cross-over probabilities of $P_1$ and $P_2$,
respectively. Let us say that the network costs are such that $E_1$
and $E_2$ send at rates $\Delta$ more than their respective conditional
entropies (for some $\Delta > 0$), i.e., $R_1 = H_b(P_1) + \Delta$ and
$R_2 = H_b(P_2) + \Delta$, where $H_b(\cdot)$ denotes the binary entropy
function (note that the conditional entropy is the minimum
information each encoder has to send). Wyner p4| (see also 
(34I ) showed that the minimum rate from Eq to the two sinks 
under broadcast routing is given by:

\[
R_0 \ge \max_{i = 1, 2} \{1 - H_b(P_{0i})\} \tag{55}
\]

where $P_{01}$ and $P_{02}$ solve the respective equations
$H_b(P_1 * P_{01}) = H_b(P_1) + \Delta$ and $H_b(P_2 * P_{02}) = H_b(P_2) + \Delta$,
where $P_1 * P_2 = P_1 P_2 + (1 - P_1)(1 - P_2)$. The optimum
$U_0$ which achieves the boundary points is obtained by passing
$X_0$ through a binary symmetric channel (BSC) with cross-over
probability $P_0$. Again observe that, if the sinks $S_1$ and $S_2$
receive information from $E_1$ and $E_2$ at rates $H_b(P_1) + \Delta$
and $H_b(P_2) + \Delta$, they require information from $E_0$ at rates
$1 - H_b(P_{01})$ and $1 - H_b(P_{02})$, respectively. However, broadcast
routing sends information at the maximum of the two to both
sinks and hence if $P_1 \ne P_2$ (which in turn implies $P_{01} \ne P_{02}$
in general), there is sub-optimality on either one of the two
branches connecting from the collector to the two sinks.

On the other hand, using DIR, we can achieve minimum
rates on all the branches. To prove this claim, without loss of
generality, let us assume that $0.5 > P_{01} > P_{02} > 0$. Consider
the following joint density for $(U_0, U_1, U_2)$ in Theorem 3: $U_2$
is the output when $X_0$ is sent through a BSC with cross-over
probability $P_{02}$, and $U_0$ is the output when $U_2$ is sent through a
BSC with cross-over probability $P_{012}$, where $P_{02} * P_{012} = P_{01}$.
$U_1$ is set as a constant. It is easy to verify from Theorem 3
that the following rates are achievable:

\[
\begin{aligned}
R_{0,12} &= 1 - H_b(P_{01}) \\
R_{0,2} &= H_b(P_{01}) - H_b(P_{02})
\end{aligned} \tag{56}
\]

which implies that the two sinks receive at their respective
minima, leading to the conclusion that DIR achieves the
minimum communication cost for this example.
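The example can be checked numerically. The sketch below is an illustration under stated assumptions: the values of $P_1$, $P_2$ and $\Delta$, the unit weights, and all function names are hypothetical. It uses the standard binary convolution $p(1-q) + q(1-p)$; since $H_b(x) = H_b(1-x)$, this gives the same entropy as the expression for $P_1 * P_2$ in the text. The weights enter as in the helper terms of the cost function (41).

```python
from math import log2

def hb(p):
    """Binary entropy function in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

def conv(p, q):
    """Cross-over probability of two cascaded binary symmetric channels."""
    return p * (1 - q) + q * (1 - p)

def solve_p0(p, delta, tol=1e-12):
    """Find q in [0, 0.5] with hb(conv(p, q)) = hb(p) + delta by bisection.
    Assumes 0 < p < 0.5 and hb(p) + delta <= 1, so the left side is
    increasing in q on this interval."""
    target = hb(p) + delta
    lo, hi = 0.0, 0.5
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if hb(conv(p, mid)) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Hypothetical parameters, chosen so that P01 > P02 as assumed in the text.
P1, P2, DELTA = 0.2, 0.1, 0.1
p01, p02 = solve_p0(P1, DELTA), solve_p0(P2, DELTA)

# Broadcast routing: one helper packet at the larger requirement, to both sinks.
r_broadcast = max(1 - hb(p01), 1 - hb(p02))
# DIR: a common packet plus a refinement sent only to sink 2, as in (56).
r_012 = 1 - hb(p01)
r_02 = hb(p01) - hb(p02)

# Hypothetical unit weights for the helper branches.
W0 = W1 = W2 = 1.0
cost_broadcast = (W0 + W1 + W2) * r_broadcast
cost_dir = (W0 + W1 + W2) * r_012 + (W0 + W2) * r_02
print(p01, p02, r_broadcast, r_012, r_02, cost_broadcast, cost_dir)
```

In this setting sink 2 still receives a total helper rate of $1 - H_b(P_{02})$, while sink 1 receives only the common packet, which is where the saving over broadcast routing comes from.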

VI. Conclusion 

This paper considers a new routing paradigm called dis- 
persive information routing, wherein each intermediate node 
is allowed to "split a packet" and forward subsets of the 
information on individual forward paths. We demonstrated 
using simple examples the gains of DIR over broadcast 
routing. Unlike network coding, this new routing technique 
can be realized using conventional routers with source nodes 
transmitting multiple smaller packets into the network. This 
paradigm introduces a new class of information theoretic 
problems. We derived an achievable rate region for this setup 
using principles from multiple descriptions encoding and Han 
and Kobayashi decoding which is complete for certain special 
cases of the setup.

Appendix

Appendix A: Bounding Encoding/Decoding Errors
in Theorem 1

Proof:
Probability of encoding error: Let us analyze the probability
of encoding error at source node $E_i$. Let $\mathcal{E}$ denote the
event of an encoding error. We have:

\[
P(\mathcal{E}) = P(\mathcal{E} | x_i^n \notin T_\epsilon^n) P(x_i^n \notin T_\epsilon^n)
+ P(\mathcal{E} | x_i^n \in T_\epsilon^n) P(x_i^n \in T_\epsilon^n) \tag{57}
\]

From standard typicality arguments, we have $P(x_i^n \notin T_\epsilon^n) \to 0$
as $n \to \infty$. Hence, it is sufficient to find conditions on the
rates to bound $P(\mathcal{E} | x_i^n \in T_\epsilon^n)$.

Towards finding conditions on the rate to bound
$P(\mathcal{E} | x_i^n \in T_\epsilon^n)$, we define the random variables:


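\[
\chi(\{j\}_{J(\Pi)}) =
\begin{cases}
1 & \text{if } \big(x_i^n, u_i^n(\{j\}_{J(\Pi)})\big) \in T_\epsilon^n \\
0 & \text{else}
\end{cases} \tag{58}
\]

We have $P(\mathcal{E} | x_i^n \in T_\epsilon^n) = P(\psi = 0)$, where
$\psi = \sum_{\{j\}_{J(\Pi)}} \chi(\{j\}_{J(\Pi)})$. From Chebyshev's inequality, it follows
that:

\[
P(\psi = 0) \le P\big[\,|\psi - E[\psi]| > E[\psi]/2\,\big] \le \frac{4\,\mathrm{Var}(\psi)}{(E[\psi])^2} \tag{59}
\]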
From Lemma 3.1 in |3|, we can bound E[^] as follows: 

where 

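\[
\alpha(i, Q) = -H(\{U_i\}_Q | X_i) + \sum_{\mathcal{K} \in Q} H\big(U_{i,\mathcal{K}} \,\big|\, \{U_i\}_{\mathcal{I}_{|\mathcal{K}|+}(\mathcal{K})}\big) \tag{61}
\]

$\forall i$, $Q \subseteq J(\Pi)$. We follow the convention $\alpha(i, \phi) = 0$. Next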
consider Var(«') = E[<i/^] - {E[<i/]f where, 

E{^']= E E E[xi{j}jiuM{k}j(u))] 

{j}j(u) {k}j(u) 

= E E P[xi{j}jiu)) = l,xmJin)) = l] (62) 

{j}j(n) {k}j(u) 



The probability in (62) depends on whether ^i^({j}j7(n)) 
and W^iikjji^u)) are equal for a subset of indices. Let 
Q ^ J(n), Q 7^ 0, such that {j}Q = {/c}q. Observe that, 
due to the hierarchical structure in the conditional codebook 
generation mechanism, for u2{{j}Q) = ^i^({/c}q) to hold, Q 
must be such that. 



if /Cg Q X\n+{^ c Q 



(63) 



i.e., Q G Q* given in ( [29] ). It follows from the codebook gen- 
eration mechanism that given the codeword tuple {u^{{j}Q)}, 
tuples {<({j}:7(n)-Q)} and {<({^}:7(n)-Q)} are indepen- 
dent and identically distributed. Hence we can rewrite the 
probability in ([62]) for some Q C J7(n), Q 7^ ^, as: 



p[m}jiu))r\£mj^^n))] 



pmj]Q)] ) 
^pmj)Q)] (64) 



However, note that if Q = 0, then: 
P Wj]jin))f^£mji.n))\ = {P [^({J}^(n))])' (65) 



Next, the total number of ways of choosing and 
{k}j{ii) such that they overlap in the subset Q is: 



On substituting (64) and (66) in (62), we bound Var(^) as: 



Var(^) < ^|2"^^("^^'^^"^^"^^e^w ^^^^) 



(67) 



where the summation is over all non-empty Q such that ( 63 ) 
holds. Observe that the term corresponding to Q = ^ gets 
canceled with the '(£^[^])^' term in Var(^). Inserting, (67) 



and (60) in ([59]), we get : 



+7ne 



(68) 



where the summation is over all non-empty Q satisfying (63). 
Hence, the probability of encoding error at all the source nodes 
can be made arbitrarily small if: 



/CGQ 



(69) 



Vz, Q satisfying (63). 

Probability of decoding error : We focus on decoding at 
sink Sk- We first bound the probability of error for the first 
stage of decoding. The decoder looks for a unique codeword 
tuple from \ {CY}j{k) [{^T}j{k)) } which are jointly typical. 
We know that {u^}j(^k) are jointly typical from the Markov 
Lemma in Appendix B. We have to find conditions on R- ^ 
to ensure no other tuple satisfies this property. Denote by 
the event of a decoding error given the encoding is error- free. 
Due to the symmetry in codebook generation, we can assume 
that the index tuple of {u^}j^k) is (1, • • • , 1). Let {j^}j^k) 
be an index tuple such that: 

{is}:r(fc)7^(l,...,l) (70) 
Define the event J^{{jT}j{k)) as: 

H{h]jik)) = { G (71) 

It then follows from union bound that: 

p(:f) < E^(-^({Js}^(fc))) (72) 

where the summation is over all {jT^j^k) 7^ (l^---^!)- 
However, a subset of indices of {jT^j^k) ean still be equal 
to 1. We expand the above summation over all such possible 



subsets. Let Qi, Q2, • • • Qat ^ J{k) satisfying (63) be such 
that the following hold^ 



3zG{l,2,...,7V}:Q, C J(A:) 



(73) 



^ Again observe that it is sufficient for us to consider $Q_i$'s which satisfy
(63) due to the hierarchical structure of the conditional codebook generation.

i.e., at least one of the $Q_i$'s is a strict subset of $J(k)$. Define
the set:



j{k) 



I ji,K = 1 if /C G Qi 



[jiX 1 otherwise 

Then, we can expand ([72]) as: 
where the first summation is over all {Qi, . . . , Qat : Qi ^ 



(74) 



satisfying ( |63| ) and (73)} and the second summa- 
tion is over all {jT}j{k) ^ ^Qi,Q2,.- Q7v ^^^^ ^^at, 
due to the conditional independence of the codewords gen- 
erated, P{J^{{jT}j{k))) is the same for all {jT}j{k) ^ 
^(Qi,Q2,...Qiv), i.e., P{J^{{3T}j{k))) depends only on 
Qi, Q2, . . . Qat. We can bound P{F) as: 



P(^(Q,;VzGS))} 



(75) 



where P(^(Qi;Vz G S)) =P(^(Qi, Q2, • • • , Qiv)) de- 
notes P(J^({js}:7(fe))) for some {jY.}j{k) ^ ^Qi^Q^.-Qn 
and the summation is over all {Qi,...,QAr : Qi ^ 
J{k)^ satisfying ( [63] ) and ([73])}. We next bound the individ- 
ual terms in the above product. Recall that each of the bins 
Ci^s{') have 2''^^-^s-Ris) codewords. Using Lemma 3.1 jsj, 
we can bound both the terms in the above product as: 



P(^(Q,;VzgS)) < 



I^Q.,Q.,...qJ < 2 -^--^^ (76) 

where Q^^ = J{k) - Qi. Substituting ^ in ( [75] ), 

it follows that P{T) can be made arbitrarily small if: 
VQi, Q2, . . . Qat C J(fe) satisfying ([63| and (|73l. 



E E <E E w.d{^jx,.„(^)) 

(77) 

where Q^^ = J {k) — Qi. On plugging in the bounds for i?- ^ 
from ( [69] ) into ( [77] ), we get ( [35] ) in Theorem [T] 



Appendix B: Conditional Markov Lemma - For 
Mutual Covering 

It was shown in [3]^ that if a codeword of $U_1$ (denoted by
$U_1^*$) is selected jointly typical with $X_1^n$ and a codeword of
$U_2$ (denoted by $U_2^*$) is selected jointly typical with $X_2^n$, and
if $U_1 \leftrightarrow X_1 \leftrightarrow X_2 \leftrightarrow U_2$, then $(U_1^*, X_1^n, X_2^n, U_2^*)$ are jointly
typical. This is called the generalized Markov lemma and is
depicted in Fig. 9a. Similarly, Wagner et al. [10] considered
the case in which codewords of $U_{11}$ and $U_{22}$ are generated
conditioned on codewords of $U_1$ and $U_2$, respectively. They
showed that if a pair of codewords of $(U_1, U_{11})$ (denoted
by $(U_1^*, U_{11}^*)$) are jointly typical with $X_1^n$ and a pair of
codewords of $(U_2, U_{22})$ (denoted by $(U_2^*, U_{22}^*)$) are jointly typical
with $X_2^n$, and if $(U_1, U_{11}) \leftrightarrow X_1 \leftrightarrow X_2 \leftrightarrow (U_2, U_{22})$, then
$(U_1^*, U_{11}^*, X_1^n, X_2^n, U_2^*, U_{22}^*)$ are jointly typical. This is called
the conditional Markov lemma for obvious reasons and is
depicted in Fig. 9b. However, these results are not sufficient
for our scenario and we require a stronger version of the
conditional Markov lemma. In what follows, we will establish
a series of lemmas, culminating with the needed variant called
the conditional Markov lemma for mutual covering (Lemma
3). Note that these lemmas can be easily extended to more
than 2 random variables and layers of encoding. However, we
restrict ourselves to the 2 variable case to keep the notation
simple. We also note that the lemmas and proofs here are
applicable to more general contexts beyond DIR.

^We note that an earlier Markov Lemma proof appeared in [9]. However,
the proof in [3] is easily extendible to more general settings as it is based on
standard typicality arguments.

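Lemma 1. Let random variables $(Y, U, V_1, V_2)$ be given and let $y^n \in \mathcal{T}_\epsilon^n(Y)$. Let the subset $B_0(y^n) \subseteq \mathcal{T}_\epsilon^n(U|y^n)$ be such that:

$2^{n(H(U|Y)-\lambda)} \le |B_0(y^n)| \le 2^{n(H(U|Y)+\lambda)}$ (78)

for some $\lambda > 0$. For every $u^n \in B_0(y^n)$, let the subset $B_{12}(y^n,u^n) \subseteq \mathcal{T}_\epsilon^n((V_1,V_2)|u^n)$ be such that:

$2^{n(H(V_1,V_2|U,Y)-\lambda)} \le |B_{12}(y^n,u^n)| \le 2^{n(H(V_1,V_2|U,Y)+\lambda)}$ (79)

and the following hold:

$2^{n(H(V_1|U,Y)-\lambda)} \le |B_1(y^n,u^n)| \le 2^{n(H(V_1|U,Y)+\lambda)}$
$2^{n(H(V_2|U,Y)-\lambda)} \le |B_2(y^n,u^n)| \le 2^{n(H(V_2|U,Y)+\lambda)}$
$2^{n(H(V_1|U,Y,V_2)-\lambda)} \le |B_1(y^n,u^n,v_2^n)| \le 2^{n(H(V_1|U,Y,V_2)+\lambda)}$
$2^{n(H(V_2|U,Y,V_1)-\lambda)} \le |B_2(y^n,u^n,v_1^n)| \le 2^{n(H(V_2|U,Y,V_1)+\lambda)}$ (80)

where, $\forall (v_1^n,v_2^n) \in B_{12}(y^n,u^n)$:

$B_1(y^n,u^n,v_2^n) = \{\tilde{v}_1^n : (\tilde{v}_1^n, v_2^n) \in B_{12}(y^n,u^n)\}$
$B_2(y^n,u^n,v_1^n) = \{\tilde{v}_2^n : (v_1^n, \tilde{v}_2^n) \in B_{12}(y^n,u^n)\}$
$B_1(y^n,u^n) = \{\tilde{v}_1^n : \exists\, \tilde{v}_2^n \text{ with } (\tilde{v}_1^n, \tilde{v}_2^n) \in B_{12}(y^n,u^n)\}$
$B_2(y^n,u^n) = \{\tilde{v}_2^n : \exists\, \tilde{v}_1^n \text{ with } (\tilde{v}_1^n, \tilde{v}_2^n) \in B_{12}(y^n,u^n)\}$ (81)

Let $R_0$, $R_1$ and $R_2$ be given positive rates. Let $U_j$ ($j = 1, \ldots, 2^{nR_0}$) be random variables drawn independently and uniformly from $\mathcal{T}_\epsilon^n(U)$. For each $U_j$, let $V^1_{j,k}$ ($k = 1, \ldots, 2^{nR_1}$) and $V^2_{j,k}$ ($k = 1, \ldots, 2^{nR_2}$) be random variables drawn independently and uniformly from $\mathcal{T}_\epsilon^n(V_1|U_j)$ and $\mathcal{T}_\epsilon^n(V_2|U_j)$, respectively. Then for $n$ sufficiently large,

$P\left(\nexists\, j, k_1, k_2 : U_j \in B_0(y^n),\ (V^1_{j,k_1}, V^2_{j,k_2}) \in B_{12}(y^n,U_j)\right) \le \delta(\epsilon)$ (82)

where $\delta(\epsilon) \to 0$ as $\epsilon \to 0$, if the rates $R_0$, $R_1$ and $R_2$ satisfy:

$R_0 > I(Y;U) + 7\lambda + 19\epsilon$
$R_0 + R_1 > I(Y;V_1,U) + 8\lambda + 17\epsilon$
$R_0 + R_2 > I(Y;V_2,U) + 8\lambda + 17\epsilon$
$R_0 + R_1 + R_2 > I(Y;V_1,V_2,U) + I(V_1;V_2|U) + 6\lambda + 15\epsilon$ (83)

[Fig. 9: Depicts the different Markov lemmas. (a) Generalized Markov Lemma [3]. (b) Conditional Markov Lemma [10]. (c) Conditional Markov Lemma - for Mutual Covering.]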
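The right-hand side of the sum-rate condition in (83), without the slack terms, can be read as an entropy gap: $I(Y;V_1,V_2,U) + I(V_1;V_2|U) = H(U) + H(V_1|U) + H(V_2|U) - H(U,V_1,V_2|Y)$. The following is a minimal numerical sanity check of this identity on a randomly generated joint PMF. It is only a sketch: the helper names (`entropy`, `marginal`, `H`, `I`, `random_joint`) and the binary alphabets are assumptions made for illustration and do not appear in the paper.

```python
import itertools
import math
import random


def entropy(pmf):
    """Shannon entropy (in bits) of a PMF given as {outcome: probability}."""
    return -sum(p * math.log2(p) for p in pmf.values() if p > 0)


def marginal(joint, axes):
    """Marginalize a joint PMF onto the coordinates listed in `axes`."""
    out = {}
    for outcome, p in joint.items():
        key = tuple(outcome[a] for a in axes)
        out[key] = out.get(key, 0.0) + p
    return out


def H(joint, axes, given=()):
    """Conditional entropy H(axes | given) = H(axes, given) - H(given)."""
    axes, given = tuple(axes), tuple(given)
    h_given = entropy(marginal(joint, given)) if given else 0.0
    return entropy(marginal(joint, axes + given)) - h_given


def I(joint, a, b, given=()):
    """Conditional mutual information I(a ; b | given)."""
    return H(joint, a, given) - H(joint, a, tuple(b) + tuple(given))


def random_joint(seed, alphabet=2, n_vars=4):
    """Hypothetical test PMF: random positive weights, normalized to sum to 1."""
    rng = random.Random(seed)
    outcomes = list(itertools.product(range(alphabet), repeat=n_vars))
    weights = [rng.random() for _ in outcomes]
    total = sum(weights)
    return {o: w / total for o, w in zip(outcomes, weights)}


# Coordinate positions of (Y, U, V1, V2) inside each outcome tuple.
Y, U, V1, V2 = 0, 1, 2, 3
joint = random_joint(seed=0)

# Entropy gap: log-size of the sampling space minus log-size of the target set.
lhs = H(joint, [U]) + H(joint, [V1], [U]) + H(joint, [V2], [U]) \
    - H(joint, [U, V1, V2], [Y])
# Sum-rate threshold in (83), without the slack terms.
rhs = I(joint, [Y], [U, V1, V2]) + I(joint, [V1], [V2], [U])

print(f"entropy gap = {lhs:.6f}, sum-rate threshold = {rhs:.6f}")
```

The two printed values should agree up to floating-point error for any joint PMF, since the identity follows from the chain rule alone.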



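Proof: Define the random variable $\chi_{j,k_1,k_2}$:

$\chi_{j,k_1,k_2} = \begin{cases} 1 & \text{if } U_j \in B_0(y^n),\ (V^1_{j,k_1}, V^2_{j,k_2}) \in B_{12}(y^n,U_j) \\ 0 & \text{else} \end{cases}$ (84)

Denote by $\mathcal{X} = \sum_{j,k_1,k_2} \chi_{j,k_1,k_2}$. Observe that the probability in (82) is equal to $P(\mathcal{X} = 0)$. From Chebyshev's inequality, we have:

$P(\mathcal{X} = 0) \le \frac{4\,\mathrm{Var}(\mathcal{X})}{(E[\mathcal{X}])^2}$ (85)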



Next we have the following from (78) and (79):

$E[\mathcal{X}] \overset{(a)}{=} 2^{n(R_0+R_1+R_2)} P(\chi_{1,1,1} = 1) \ge 2^{n(R_0+R_1+R_2)} \frac{2^{n(H(V_1,V_2,U|Y)-2\lambda-5\epsilon)}}{2^{n(H(U)+H(V_1|U)+H(V_2|U))}}$ (86)

where equality in (a) holds because the random variables $U_j$, $V^1_{j,k_1}$ and $V^2_{j,k_2}$ are drawn independently and uniformly from their respective typical sets. Also, using (79) and (80), we can bound $E[\mathcal{X}^2]$ as:

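$E[\mathcal{X}^2] = \sum_{j,k_1,k_2} \sum_{j',k_1',k_2'} E[\chi_{j,k_1,k_2}\,\chi_{j',k_1',k_2'}]$
$\le 2^{n(R_0+R_1+R_2)} P_1 + 2^{n(R_0+2R_1+2R_2)} P_2 + 2^{n(R_0+R_1+2R_2)} P_3 + 2^{n(R_0+2R_1+R_2)} P_4 + 2^{2n(R_0+R_1+R_2)} P^2(\chi_{1,1,1} = 1)$ (87)

where

$P_1 = \frac{2^{n(H(V_1,V_2,U|Y)+2\lambda+5\epsilon)}}{2^{n(H(U)+H(V_1|U)+H(V_2|U))}}$
$P_2 = \frac{2^{n(H(U|Y)+2H(V_1,V_2|U,Y)+3\lambda+9\epsilon)}}{2^{n(H(U)+2H(V_1|U)+2H(V_2|U))}}$
$P_3 = \frac{2^{n(H(V_1,U|Y)+2H(V_2|U,Y,V_1)+4\lambda+7\epsilon)}}{2^{n(H(V_1,U)+2H(V_2|U))}}$
$P_4 = \frac{2^{n(2H(V_1|U,Y,V_2)+H(V_2,U|Y)+4\lambda+7\epsilon)}}{2^{n(2H(V_1|U)+H(V_2,U))}}$ (88)

On substituting (86), (87) and (88) in (85), we have: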



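$P(\mathcal{X} = 0) \le 4\Big[\, 2^{-n(R_0 - I(Y;U) - 7\lambda - 19\epsilon)}$
$\quad + 2^{-n(R_0+R_1 - I(Y;V_1,U) - 8\lambda - 17\epsilon)}$
$\quad + 2^{-n(R_0+R_2 - I(Y;V_2,U) - 8\lambda - 17\epsilon)}$
$\quad + 2^{-n(R_0+R_1+R_2 - I(Y;V_1,V_2,U) - I(V_1;V_2|U) - 6\lambda - 15\epsilon)} \Big]$ (89)

which can be made arbitrarily small if the rates satisfy (83).

■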
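As a worked instance of the substitution step, the following sketch traces how the first condition in (83) arises. It assumes that the term of (87) growing as $2^{n(R_0+2R_1+2R_2)}$ (pairs of index triples that share $j$ but differ in both $k_1$ and $k_2$) is the one multiplied by the ratio in (88) whose denominator is $2^{n(H(U)+2H(V_1|U)+2H(V_2|U))}$; this pairing is inferred from the exponents rather than stated explicitly in the text.

```latex
% Numerator: the shared-U_j term of (87) with its ratio from (88).
% Denominator: the square of the lower bound on E[X] from (86).
\begin{align*}
\frac{2^{n(R_0+2R_1+2R_2)}\,
      2^{n(H(U|Y)+2H(V_1,V_2|U,Y)+3\lambda+9\epsilon)}\,
      2^{-n(H(U)+2H(V_1|U)+2H(V_2|U))}}
     {2^{2n(R_0+R_1+R_2)}\,
      2^{2n(H(V_1,V_2,U|Y)-2\lambda-5\epsilon)}\,
      2^{-2n(H(U)+H(V_1|U)+H(V_2|U))}}
&= 2^{-n\left(R_0 - H(U) + H(U|Y) - 7\lambda - 19\epsilon\right)} \\
&= 2^{-n\left(R_0 - I(Y;U) - 7\lambda - 19\epsilon\right)}
\end{align*}
% Steps used:
%   H(V_1,V_2,U|Y) = H(U|Y) + H(V_1,V_2|U,Y)      (chain rule)
%   rate exponent:  (R_0+2R_1+2R_2) - 2(R_0+R_1+R_2) = -R_0
%   slack:          (3\lambda+9\epsilon) + 2(2\lambda+5\epsilon)
%                   = 7\lambda+19\epsilon
```

The remaining three exponents in (89) follow from the same bookkeeping applied to the other ratios in (88).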

Lemma 2. Let $Y$, $U$, $V_1$ and $V_2$ be random variables with values in finite sets $\mathcal{Y}$, $\mathcal{U}$, $\mathcal{V}_1$ and $\mathcal{V}_2$, respectively. Let $W^n$ be a random variable with values in $\mathcal{W}^n$, such that:



(90) 



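Let $R_0$, $R_1$ and $R_2$ be given positive rates. Let $U_i$ ($i = 1, \ldots, 2^{nR_0}$) denote independent random variables chosen uniformly with replacement from $\mathcal{T}_\epsilon^n(U)$. Let $V^1_{i,j}$ ($i = 1, \ldots, 2^{nR_0}$, $j = 1, \ldots, 2^{nR_1}$) and $V^2_{i,j}$ ($i = 1, \ldots, 2^{nR_0}$, $j = 1, \ldots, 2^{nR_2}$) be random variables drawn independently and uniformly from $\mathcal{T}_\epsilon^n(V_1|U_i)$ and $\mathcal{T}_\epsilon^n(V_2|U_i)$, respectively, $\forall i$. Further, let,

$P\left((W^n, Y^n, U^n, V_1^n, V_2^n) \in \mathcal{T}_\epsilon^n(W,Y,U,V_1,V_2)\right) \ge 1 - \eta$ (91)

Also, suppose $\forall v_1^n \in \mathcal{T}_\epsilon^n(V_1)$ and $v_2^n \in \mathcal{T}_\epsilon^n(V_2)$:

$P\left((W^n, Y^n, U^n, V_1^n) \in \mathcal{T}_\epsilon^n \mid V_2^n = v_2^n\right) \ge 1 - \eta$
$P\left((W^n, Y^n, U^n, V_2^n) \in \mathcal{T}_\epsilon^n \mid V_1^n = v_1^n\right) \ge 1 - \eta$ (92)

Then for $n$ sufficiently large, there exist functions $U^*(y^n)$, $V_1^*(y^n,U^*)$ and $V_2^*(y^n,U^*)$, such that:

i) $U^*(y^n) = U_i$ for some $i \in \{1, \ldots, 2^{nR_0}\}$, $V_1^*(y^n,U^*) = V^1_{i,j_1}$ and $V_2^*(y^n,U^*) = V^2_{i,j_2}$ for some $j_1 \in \{1, \ldots, 2^{nR_1}\}$ and $j_2 \in \{1, \ldots, 2^{nR_2}\}$

ii)

$P\left((W^n, Y^n, U^*, V_1^*, V_2^*) \in \mathcal{T}_\epsilon^n\right) \ge 1 - \delta(\epsilon)$
$P\left((W^n, Y^n, U^*, V_1^*) \in \mathcal{T}_\epsilon^n \mid V_2^*\right) \ge 1 - \delta(\epsilon)$
$P\left((W^n, Y^n, U^*, V_2^*) \in \mathcal{T}_\epsilon^n \mid V_1^*\right) \ge 1 - \delta(\epsilon)$ (93)

for some $\delta(\epsilon) \to 0$ as $\epsilon \to 0$, if the rates $R_0$, $R_1$ and $R_2$ satisfy:

$R_0 > I(Y;U) + 40\epsilon$
$R_0 + R_1 > I(Y;V_1,U) + 41\epsilon$
$R_0 + R_2 > I(Y;V_2,U) + 41\epsilon$
$R_0 + R_1 + R_2 > I(Y;V_1,V_2,U) + I(V_1;V_2|U) + 33\epsilon$ (94)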
Proof: Let us expand (91) as:



(95) 



Let, 



> 1 



and

$A_0 = A \cap \mathcal{T}_\epsilon^n(Y)$ (96)



Then using the reverse Markov inequality, we can show that (similar to [3], [10]):

$P(Y^n \in A_0) \ge 1 - \delta_1$ (97)

where $\delta_1 = \sqrt{\eta} + \epsilon$. Then for any $y^n \in A_0$, we have:

^{p((W^*,yi",F2") e 77|r" =y",[/" = w") 

P(^/7^ = = I > 1 - (98) 

Let, 

>1-^} 

Po=pn^"(^i^") ^9^) 

Using the reverse Markov inequality, we again have:

$P\left(U^n \in B_0(y^n) \mid Y^n = y^n\right) \ge 1 - \delta_2$ (100)

where $\delta_2 = \sqrt{\delta_1} + \epsilon$. Hence for any $y^n \in A_0$ and $u^n \in B_0(y^n)$ we have:

^ p(Fi^ =<,F2'' =^2k'' =^'',^'' =^'') 

,V'^ 

Q(y^^i^<,^2")>l-^ (101) 



where we denote by $Q(S) = P\left(W^n \in \mathcal{T}_\epsilon^n(W|S)\right)$ for any set of sequences $S$. Note that we have used the Markov condition (90) in the above equation. Now define
sets Bi2{y^-,u^) and Bi2{y^,u^) for any G and 
G Po(^'') such that: 

Pl2(^^^i") = {«,^2") : g(^^^i^<,^2") > i - ^} 

Pi2(^") = Pi2(^") n ^1, ^2|y") (102) 
Then using the reverse Markov inequality, we can show that:

$P\left((V_1^n, V_2^n) \in B_{12}(y^n,u^n) \mid Y^n = y^n, U^n = u^n\right) \ge 1 - \delta_3$ (103)

where $\delta_3 = \sqrt{\delta_2} + \epsilon$. Then from (100), (103) and Lemma 3.1(f) in [3], for $n$ sufficiently large, we have:

$2^{n(H(U|Y)-3\epsilon)} \le |B_0(y^n)| \le 2^{n(H(U|Y)+\epsilon)}$
$2^{n(H(V_1,V_2|Y,U)-3\epsilon)} \le |B_{12}(y^n,u^n)| \le 2^{n(H(V_1,V_2|Y,U)+\epsilon)}$ (104)

Note that we have two of the sets required by Lemma 1. However, we further require bounds on the projections of $B_{12}(y^n,u^n)$ (as in (80)) to invoke Lemma 1. Towards obtaining these bounds, we note that the following inequalities can be shown directly from (91):

$P\left((W^n, Y^n, U^n, V_1^n) \in \mathcal{T}_\epsilon^n\right) \ge 1 - \eta$
$P\left((W^n, Y^n, U^n, V_2^n) \in \mathcal{T}_\epsilon^n\right) \ge 1 - \eta$ (105)

Expanding (105) instead of (91) and repeating all steps from (95) through (104), we obtain:



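$2^{n(H(V_1|Y,U)-3\epsilon)} \le |B_1(y^n,u^n)| \le 2^{n(H(V_1|Y,U)+\epsilon)}$
$2^{n(H(V_2|Y,U)-3\epsilon)} \le |B_2(y^n,u^n)| \le 2^{n(H(V_2|Y,U)+\epsilon)}$ (106)

where

$B_1(y^n,u^n) = \{v_1^n : \exists\, v_2^n \text{ with } (v_1^n,v_2^n) \in B_{12}(y^n,u^n)\}$
$B_2(y^n,u^n) = \{v_2^n : \exists\, v_1^n \text{ with } (v_1^n,v_2^n) \in B_{12}(y^n,u^n)\}$ (107)

Similarly, it is easy to show that expanding (92) instead of (91) leads to:

$2^{n(H(V_1|Y,U,V_2)-3\epsilon)} \le |B_1(y^n,u^n,v_2^n)| \le 2^{n(H(V_1|Y,U,V_2)+\epsilon)}$
$2^{n(H(V_2|Y,U,V_1)-3\epsilon)} \le |B_2(y^n,u^n,v_1^n)| \le 2^{n(H(V_2|Y,U,V_1)+\epsilon)}$ (108)

where, $\forall v_1^n \in B_1(y^n,u^n)$ and $v_2^n \in B_2(y^n,u^n)$,

$B_1(y^n,u^n,v_2^n) = \{\tilde{v}_1^n : (\tilde{v}_1^n, v_2^n) \in B_{12}(y^n,u^n)\}$
$B_2(y^n,u^n,v_1^n) = \{\tilde{v}_2^n : (v_1^n, \tilde{v}_2^n) \in B_{12}(y^n,u^n)\}$ (109)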

We now have sets $B_0$ and $B_{12}$ satisfying all the bounds as required in Lemma 1. Hence, we can define the functions $U^*$, $V_1^*$ and $V_2^*$ as follows. $U^*(y^n) = U_i$ if $U_i \in B_0(y^n)$. If no such $U_i$ exists, we set $U^*(y^n) = U_1$. Next, if there exists a pair $(V^1_{i,j_1}, V^2_{i,j_2})$ such that $(V^1_{i,j_1}, V^2_{i,j_2}) \in B_{12}(y^n,U_i)$, then define $(V_1^*(y^n,U^*), V_2^*(y^n,U^*)) = (V^1_{i,j_1}, V^2_{i,j_2})$. If there exists no such pair, define $(V_1^*(y^n,U^*), V_2^*(y^n,U^*)) = (V^1_{i,1}, V^2_{i,1})$. It follows from the rate conditions in (94), Lemma 1 with $\lambda = 3\epsilon$ and the bounds on set sizes that:



>l-S{e) 

(110) 



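for some $\delta(\epsilon) \to 0$ as $\epsilon \to 0$. Note that $Y^n \in A_0$, $U^* \in B_0(Y^n)$ and $(V_1^*, V_2^*) \in B_{12}(Y^n, U^*)$ imply that $(Y^n, U^*, V_1^*, V_2^*) \in \mathcal{T}_\epsilon^n(Y, U, V_1, V_2)$. We then have,

$P\left((W^n, Y^n, U^*, V_1^*, V_2^*) \in \mathcal{T}_\epsilon^n\right) \ge P(E_1)\,P(E_2|E_1)$ (111)

where events $E_1$ and $E_2$ are defined as:

$E_1 = \{Y^n \in A_0,\ U^* \in B_0,\ (V_1^*, V_2^*) \in B_{12}\}$
$E_2 = \{W^n \in \mathcal{T}_\epsilon^n(W \mid Y^n, U^*, V_1^*, V_2^*)\}$ (112)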



From (97), (103) and (102), we obtain bounds on $P(E_1)$ and $P(E_2|E_1)$:



p(£;i) > 1-S1-S2 

P{E2\Ei) > 1-^ 



(113) 



On substituting in (111), we obtain the first bound in (93). The other two bounds in (93) can be shown using similar arguments. ■
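The slack constants in the rate conditions (94) are consistent with those in (83) once Lemma 1 is invoked with $\lambda = 3\epsilon$, as done in the proof above. The short check below makes this arithmetic explicit. It is a sketch for illustration only; the dictionary and function names are hypothetical and not part of the paper.

```python
# Slack terms of (83), stored as (coefficient of lambda, coefficient of epsilon).
LEMMA1_SLACK = {
    "R0": (7, 19),
    "R0+R1": (8, 17),
    "R0+R2": (8, 17),
    "R0+R1+R2": (6, 15),
}


def slack_in_epsilon(lam_coeff, eps_coeff, lam_over_eps=3):
    """Total multiple of epsilon when lambda = lam_over_eps * epsilon."""
    return lam_coeff * lam_over_eps + eps_coeff


# Substitute lambda = 3 * epsilon into every condition of (83).
lemma2_slack = {name: slack_in_epsilon(*coeffs)
                for name, coeffs in LEMMA1_SLACK.items()}

for name, mult in lemma2_slack.items():
    print(f"{name}: slack = {mult} * epsilon")
```

The resulting multiples are 40, 41, 41 and 33, matching the four conditions in (94).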



Lemma 3. Conditional Markov Lemma - for Mutual Covering: Suppose that $(X_1, X_2, U_1, U_2, U_{11}, U_{12}, U_{21}, U_{22})$ are random variables taking values in arbitrary finite sets $(\mathcal{X}_1, \mathcal{X}_2, \mathcal{U}_1, \mathcal{U}_2, \mathcal{U}_{11}, \mathcal{U}_{12}, \mathcal{U}_{21}, \mathcal{U}_{22})$, respectively. Let the random variables satisfy the following Markov condition:

$(U_1, U_{11}, U_{12}) \leftrightarrow X_1 \leftrightarrow X_2 \leftrightarrow (U_2, U_{21}, U_{22})$ (114)

Let $U_{1,i}$ : $i = 1, \ldots, 2^{nR_1}$ and $U_{2,i}$ : $i = 1, \ldots, 2^{nR_2}$ be independent codewords of length $n$, each generated using the marginals $P(U_1)$ and $P(U_2)$, respectively. Let $2^{nR_{11}}$ and $2^{nR_{12}}$ codewords of $U_{11}$ and $U_{12}$ (denoted by $U_{11,i,j}$ and $U_{12,i,j}$), respectively, be generated conditioned on each codeword $U_{1,i}$. Similarly generate codewords of $U_{21}$ and $U_{22}$ at rates $R_{21}$ and $R_{22}$, respectively, conditioned on the codewords of $U_2$. Then for $n$ sufficiently large, there exist functions $U_1^*(X_1^n)$, $U_2^*(X_2^n)$, $U_{11}^*(X_1^n, U_1^*)$, $U_{12}^*(X_1^n, U_1^*)$, $U_{21}^*(X_2^n, U_2^*)$ and $U_{22}^*(X_2^n, U_2^*)$ taking values in $\mathcal{U}_1^n, \mathcal{U}_2^n, \mathcal{U}_{11}^n, \mathcal{U}_{12}^n, \mathcal{U}_{21}^n, \mathcal{U}_{22}^n$, respectively, such that:

$P\left((X_1^n, X_2^n, U_1^*, U_2^*, U_{11}^*, U_{12}^*, U_{21}^*, U_{22}^*) \in \mathcal{T}_\epsilon^n\right) \ge 1 - \delta(\epsilon)$ (115)

where $\delta(\epsilon) \to 0$ as $\epsilon \to 0$ if the rates satisfy:

$R_1 > I(X_1;U_1)$,
$R_2 > I(X_2;U_2)$,
$R_1 + R_{11} > I(X_1;U_{11},U_1)$,
$R_1 + R_{12} > I(X_1;U_{12},U_1)$,
$R_2 + R_{21} > I(X_2;U_{21},U_2)$,
$R_2 + R_{22} > I(X_2;U_{22},U_2)$,
$R_1 + R_{11} + R_{12} > I(X_1;U_{11},U_{12},U_1) + I(U_{11};U_{12}|U_1)$,
$R_2 + R_{21} + R_{22} > I(X_2;U_{21},U_{22},U_2) + I(U_{21};U_{22}|U_2)$ (116)

Note that this lemma can be easily extended to the more general case of multiple random variables and multiple layers of encoding using induction (see [3] for the general methodology). While we use the more general version in the proof of Theorem 1 in Appendix A, we restrict to the simpler case here for ease of understanding and to avoid complex notation.

Proof: We note that from standard arguments [2], [3], [35], it follows that if the rates satisfy (116), then there exist functions $U_1^*(X_1^n)$, $U_{11}^*(X_1^n, U_1^*)$ and $U_{12}^*(X_1^n, U_1^*)$ such that:

$P\left((X_1^n, U_1^*, U_{11}^*, U_{12}^*) \in \mathcal{T}_\epsilon^n\right) \ge 1 - \delta(\epsilon)$ (117)

for some $\delta(\epsilon) \to 0$ as $\epsilon \to 0$. Also, note that $X_2^n$ is drawn according to the right conditional PMF given $X_1^n$. Hence, we have:

$P\left((X_1^n, X_2^n, U_1^*, U_{11}^*, U_{12}^*) \in \mathcal{T}_\epsilon^n\right) \ge 1 - \delta(\epsilon)$ (118)

What remains for us to show is that there exist functions $U_2^*(X_2^n)$, $U_{21}^*(X_2^n, U_2^*)$ and $U_{22}^*(X_2^n, U_2^*)$, taking values in $\mathcal{U}_2^n, \mathcal{U}_{21}^n, \mathcal{U}_{22}^n$, that are jointly typical with $X_1^n, X_2^n, U_1^*, U_{11}^*, U_{12}^*$. We invoke Lemma 2 with $W^n = (X_1^n, U_1^*, U_{11}^*, U_{12}^*)$, $Y = X_2$, $U = U_2$, $V_1 = U_{21}$ and $V_2 = U_{22}$. Note that given (116) and (118), conditions (90), (91) and (92) are satisfied (for a formal proof of this claim, refer to [35]). Hence, it follows from Lemma 2 that there exist functions $U_2^*(X_2^n)$, $U_{21}^*(X_2^n, U_2^*)$ and $U_{22}^*(X_2^n, U_2^*)$ such that:

$P\left((X_1^n, X_2^n, U_1^*, U_2^*, U_{11}^*, U_{12}^*, U_{21}^*, U_{22}^*) \in \mathcal{T}_\epsilon^n\right) \ge 1 - \delta(\epsilon)$ (119)

thus proving the lemma. ■



